Informally, an extractor delivers perfect randomness from a source that may be far away from the uniform distribution, yet contains some randomness. This task is a crucial ingredient of any attempt to produce perfectly random numbers, which are required, for instance, by cryptographic protocols, numerical simulations, or randomised computations. Trevisan's extractor raised considerable theoretical interest not only because of its data parsimony compared to other constructions, but particularly because it is secure against quantum adversaries, making it applicable to quantum key distribution.

We discuss a modular, extensible and high-performance implementation of the construction based 
on various building blocks that can be flexibly combined to satisfy the requirements of a wide range 
of scenarios. Besides quantitatively analysing the properties of many combinations in practical 
settings, we improve previous theoretical proofs, and give explicit results for non-asymptotic cases. 
The self-contained description does not assume familiarity with extractors.

I. INTRODUCTION

Random numbers are key ingredients for many pur- 
poses concerning communication or computation: se- 
cretly shared, perfectly random bit strings enable two 
parties to communicate in private using a one-time pad, 
without the possibility of a third party decrypting any of 
the messages they exchange. Stochastic algorithms used 
in numerical simulation or machine learning also rely on 
random numbers as part of their input. In all such ap- 
plications, it is usually best to have uniform random- 
ness available, that is, an observer should not have prior 
knowledge about the distribution of numbers, or, more 
specifically, the content of bit strings: Each string should 
be equally probable from his point of view. Some appli- 
cations, such as encrypting a message with information- 
theoretic security, are even impossible if the randomness 
used to choose the key is not equivalent to a uniform distribution [1]. Unfortunately, despite their usefulness and
the need for them, uniformly distributed random bits are 
almost impossible to generate in practice. On the other 
hand, there are plenty of physical resources containing 
"some" randomness, for instance radioactive decay, ther- 
mal fluctuations, certain measurements on photons, or 
many others. 

This contrast motivates the study of randomness ex- 
tractors: Functions that map longer, slightly random bit 
strings onto shorter, perfectly random bit strings. They 
convert an initial distribution of random numbers (the 
source) that satisfies certain assumptions on "how ran- 
dom" it is into an almost uniform distribution over the 
output bit strings. As suggested by intuition, this is impossible in a completely deterministic way [2], and extractors indeed require a second source of randomness, the seed, that is usually assumed to be perfectly uniformly distributed.

The goal of this work is twofold: to implement a specific randomness extractor devised by Trevisan in 1999 [3]
as a practical companion to the abundant amount of 
theoretical literature on the subject, and to provide an 
overview and guidance on the topic to experimental- 
ists who need to use extractors, but would not benefit 
from working through all fundamental publications. Tre- 
visan's construction has three particular advantages: For 
one, it is secure in the presence of quantum side information, as was shown by one of the authors in collaboration with others [4]. This is especially important in the
context of cryptography, where an adversary usually has 
some prior information about the initial distribution used 
as raw material to produce a secret key. With a quantum- 
proof extractor, it is possible to eliminate all these unde- 
sired correlations by turning the initial distribution into 
a uniform one — a task referred to as privacy amplifi- 
cation. With quantum key distribution (QKD) systems 
gradually transitioning from research labs into commer- 
cial applications, it is very important to implement this 
crucial protocol step, and given a bound on how random 
and how correlated with some (quantum-)memory a bit 
string is, the algorithm can indeed perform the task of 
producing truly random and uncorrelated bits with the 
help of a short seed of uniformly distributed bits. 

Another crucial advantage of Trevisan's construction 
is that the required seed length is only poly-logarithmic 
in the length of the input. This greatly outperforms randomness extractors based on (almost) universal hashing, which are currently most often used in quantum cryptographic applications [5, 6], but require a seed whose size unfortunately scales with the length of the raw input (output) bits.

In addition, Trevisan's construction is a strong extrac- 
tor, which means that the seed is almost independent of 
the final output. This implies that randomness in the 
seed is not consumed in the process (compared to weak 
extractors) and can be used at a later time, or, as in
the case of privacy amplification, it can be obtained by
the adversary without compromising the security of the 
QKD scheme. 

Despite the considerable theoretical attention the field 
of extractors has received during the last decade, there is, 
to our knowledge, only a single publication, Ref. [7], that 
discusses a prototypical implementation of Trevisan's 
construction. However, their work has some drawbacks. Compared to Ref. [7], our implementation offers greater flexibility, as the operator can combine various building blocks that make up the extractor, and so can engineer an algorithm specifically for his needs. In terms of performance, our implementation exceeds the throughput of Ref. [7] by several orders of magnitude, and is for the first time able to scale to data sets of realistic size (exceeding the maximal amount considered in Ref. [7] by 10 orders of magnitude) for which the amount of extracted randomness actually exceeds the size of the initial seed, which marks the regime in which Trevisan's construction prevails over two-universal hashing. Besides, the full source code of our implementation is available (see footnote 1) and can
be inspected and used as basis for further research. We 
therefore hope that our implementation will be of use for 
applications in the context of quantum cryptography, for 
implementing random number generators, or as a testbed 
for developing new ideas about extractors. 

In Section II we give more precise descriptions and definitions of the involved concepts and constructs. In particular, we discuss the necessary notions of entropy and the distance of a distribution from uniform (relative to an observer). However, no prior knowledge about randomness extractors is assumed. Section III contains the necessary technical details, and can be skipped upon first reading. Section IV is devoted to the implementation: It describes the software architecture and discusses some important technical details, explains how to add new components, and gives concise algorithmic descriptions of all components. In Section V we present
comprehensive performance measurements, and discuss 
which combinations of primitives are useful for which 
purpose. The appendices collect formal definitions, pro- 
vide known extractor results with explicitly spelled out 
constants that are, in contrast to many discussions that 
rely on asymptotic notations, vital for an implementa- 
tion, and give proofs for some new propositions developed 
in the paper. 



II. OVERVIEW 

A. What are extractors? 

There are numerous possibilities to produce random 
numbers, and many of them rely on some random phys- 
ical process, turning, for instance, thermal fluctuations 
into random bit strings. The laws of physics state that 
these processes produce distributions with a non-zero en- 
tropy, and hence are somewhat random. But the uni- 
form distribution or maximal entropy case is most often 
not within reach: thermal fluctuations, for instance, re- 
quire infinite temperatures to produce truly random bit 
strings. It is therefore necessary to have an algorithm 
that extracts random numbers from some given initial 
distribution satisfying a lower bound on its entropy, turn- 
ing them into uniformly distributed ones. By shrinking 
the bit string (i.e., reducing the support of its distri- 
bution), we increase its randomness until it achieves its 
maximum. It is easy to see that such a task is impossible 
for any deterministic routine [SJ. But assuming that we 
have two distributions (seed and source) over bit strings 
at our disposal, promised to be uncorrelated and fulfilling a lower bound on their entropy, the task comes into reach. Such algorithms are called randomness extractors, and their general structure is shown in Figure 1. The additional randomness is usually taken to be uniform, and is called the seed (see footnote 2). A natural aim is to seek algorithms
that minimise the required size of the seed, or in other 
words, the amount of additional randomness. Extractors 
depend on several parameters, specifying source, seed, 
and output. This section explains the different param- 
eters and how they are quantified, and discusses their 
connection. In the second part, we briefly outline Tre- 
visan's construction. 



2 For simplicity we also treat the case of a uniform seed in this 
work, but some variations of Trevisan's extractor still apply when 
the additional randomness just fulfills a lower bound on its en- 
tropy and so the methods and code that we have developed 
can also be adapted to this setting. 



1 The sources are available under the terms of the GNU General Public License (GPL), version 2; see www.gnu.org. Essentially, this means that the code can be used and modified free of charge for research (or even commercial) work, provided that any improvements to the code are made available under similar terms.

FIG. 1. Scenario considered in randomness extraction: The 
output of a weakly random source, together with a short, uni- 
formly distributed seed, is processed by a deterministic algo- 
rithm — the extractor — that produces a uniform sequence of 
randomness that is shorter than the initial input. For strong 
extractors, the initial seed is independent of the output and 
can be re-used as part of the result. 

First, let us consider how to quantify the amount of 
randomness contained in the source. As per the seminal 
works of Boltzmann, Shannon and von Neumann, the 
amount of randomness contained in some distribution of 
numbers is best quantified by its entropy, traditionally given as $-\sum_x p_x \log p_x$. Here, $x$ ranges over all output values, and $p_x$ is the probability to observe outcome $x$. This
notion of entropy originates from statistical mechanics, 
where we deal with large numbers of independent entities 
that are usually also identically distributed. In contrast 
to that, we are interested in a single run of our extrac- 
tor, and not in statements about the output distribution 
obtained from many instances of the extractor applied 
to many independent copies of the initial distribution. 
Consequently, we have to alter the notion of entropy. 

Intuitively, the amount of randomness in some distri- 
bution is quantified by the ability to predict the observed 
values. This leads to the definition of the guessing probability $p_{\mathrm{guess}}(X)$ as the probability of correctly guessing the value of the random variable $X$. It is given by $p_{\mathrm{guess}}(X) = \max_x p_x$, since the optimal strategy is to guess the most probable value. The bigger $p_{\mathrm{guess}}$, the less random the source is. This is quantified by the min-entropy, defined by $H_{\min}(X) = -\log \max_x p_x$.
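
The difference between the two entropy notions can be made concrete with a small numerical sketch. The distribution below is a hypothetical example (it does not come from the measurements in this work), and logarithms are taken to base 2 so that all quantities are in bits.

```python
import math

def shannon_entropy(p):
    """Shannon entropy -sum_x p_x log2(p_x), in bits."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def guessing_probability(p):
    """Probability of guessing X correctly: bet on the most likely value."""
    return max(p)

def min_entropy(p):
    """Min-entropy H_min(X) = -log2 max_x p_x, in bits."""
    return -math.log2(guessing_probability(p))

# Hypothetical source over 8 outcomes: one value has probability 1/2,
# the remaining seven share the other half equally.
biased = [0.5] + [0.5 / 7] * 7
uniform = [1 / 8] * 8

print("biased :", shannon_entropy(biased), min_entropy(biased))
print("uniform:", shannon_entropy(uniform), min_entropy(uniform))
```

For the biased source the Shannon entropy is roughly 2.4 bits, while the min-entropy is only 1 bit: a single guess succeeds with probability 1/2, which is what matters for a single run of an extractor.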

The definition does so far not consider the possible 
presence of side information. In a more complex setting, 
there might be some side information E correlated to 
the source X, and the task becomes to extract uniform 
randomness from X that is independent of E. In a cryp- 
tographic context, E represents the adversary's informa- 
tion about the source. Clearly, if E is a one-to-one copy of 
X, this task is impossible, even if X is perfectly uniform. 
The notions of guessing probability and entropy conse- 
quently need to be extended such that they measure the 
randomness of the source conditioned on E. If the side 
information is classical, then extractors proven sound in the absence of side information can be used with only a small adjustment of the parameters (see footnote 3). However, this changes dramatically if the observer is allowed to use the power of quantum mechanics [8].

To guess the value of $X$, a player holding a quantum state in a system $E$ may measure this system, and make a guess based on the observed outcome. For every value $X = x$, his quantum memory is in some conditional state $\rho_x$, and his task reduces to distinguishing the different states $\rho_x$. Mathematically, such a measurement is specified by a positive operator-valued measure (POVM) $\{E_x\}$. Thus, the probability to correctly guess the value taken by $X$ is given by $p_{\mathrm{guess}}(X|E) = \sum_x p_x \operatorname{tr}(\rho_x E_x)$. The corresponding entropic quantity, the conditional min-entropy [5], is given by $H_{\min}(X|E) = -\log \sum_x p_x \operatorname{tr}(\rho_x E_x)$, where we take $\{E_x\}$ to be the optimal measurement.
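
For intuition, the special case of classical side information can be computed directly: if $E$ is a classical random variable, the best strategy is to read $E$ and guess the most likely $x$ given its value, so that $p_{\mathrm{guess}}(X|E) = \sum_e \max_x \Pr[X = x, E = e]$. The following sketch assumes this classical setting only (it does not optimise over quantum measurements), and the joint distributions are hypothetical examples.

```python
import math

def conditional_guessing_probability(joint):
    """joint[e][x] = Pr[X = x, E = e]; returns sum_e max_x Pr[X = x, E = e]."""
    return sum(max(row) for row in joint)

def conditional_min_entropy(joint):
    """Classical conditional min-entropy H_min(X|E) in bits."""
    return -math.log2(conditional_guessing_probability(joint))

# X is a uniform 2-bit string and E reveals its first bit.
joint_leaky = [
    [0.25, 0.25, 0.0, 0.0],  # E = 0: X is 00 or 01
    [0.0, 0.0, 0.25, 0.25],  # E = 1: X is 10 or 11
]

# X is a uniform 2-bit string and E is an independent fair coin.
joint_independent = [
    [0.125, 0.125, 0.125, 0.125],
    [0.125, 0.125, 0.125, 0.125],
]

print(conditional_min_entropy(joint_leaky))        # one bit is lost to E
print(conditional_min_entropy(joint_independent))  # both bits remain
```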

Having specified the quantification of randomness, we need to define what we mean by an "almost uniform" distribution over the output $Z = \mathrm{Ext}(X, Y)$, where $X$ is the source and $Y$ the seed. Again intuitively, we would like to assure that a player holding some side information cannot do better than with a random guess, that is, the probability that he guesses correctly should be close to $2^{-m}$ if the output is a bit string of length $m$. Mathematically, this is expressed by requiring that the joint state of the output and the side information, $\rho_{ZE}$, is close to a product state of a perfectly uniform output $\tau$ (the fully mixed state) and the side information $\rho_E$, that is, we want $\rho_{ZE} \approx \tau \otimes \rho_E$. The distance between these states (see footnote 4) is usually denoted by $\varepsilon$, and referred to as the error of the extractor. Colloquially, an error of $\varepsilon$ corresponds to a probability of at most $2^{-m} + \varepsilon$ that the output can be guessed correctly, and a probability of at most $1/2 + \varepsilon$ that any single bit can be guessed.
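
In the absence of side information, the trace distance reduces to the statistical distance between classical distributions, i.e. half the $L_1$ distance. The sketch below covers only this classical special case and uses a hypothetical output distribution over 2-bit strings. It illustrates that the guessing probability stays below $2^{-m} + \varepsilon$.

```python
from fractions import Fraction

def distance_from_uniform(p):
    """Half the L1 distance between p and the uniform distribution."""
    u = Fraction(1, len(p))
    return sum(abs(Fraction(px) - u) for px in p) / 2

# Hypothetical output distribution over m = 2 bits (4 strings).
m = 2
p = [Fraction(3, 8), Fraction(1, 4), Fraction(1, 4), Fraction(1, 8)]

eps = distance_from_uniform(p)
print("error:", eps)
print("guessing probability:", max(p), "<=", Fraction(1, 2 ** m) + eps)
```

Exact rational arithmetic is used here merely to keep the example free of rounding effects.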

We are now able to define extractors in more detail. We assume that the inputs are bit strings of length $n$ and that the distribution has a conditional min-entropy of at least $k$. For processing each input string, $d$ randomly distributed bits may be used. The output should consist of bit strings of length $m$, and the distribution of outcomes should be $\varepsilon$-close to uniform and independent of the side information. We call a deterministic function taking as input the source and the seed and achieving these goals a quantum-proof $(k, \varepsilon)$-extractor (see footnote 5). The output length of such an extractor is $m$. Naturally, we would like to have $m$ as close to $k$ as possible, which means that most of the entropy has been extracted. The value $k - m$ is therefore called the entropy loss. The extractor is called strong if the output is also close to independent of the value of the seed, or equivalently, the output of the extractor is a pair of bit strings, the first being the value of the random bits used as seed, and the second being the output. This is exactly the setting needed for the privacy amplification step in quantum key distribution protocols, as both Alice and Bob need to use the same value for the seed in order to produce a correlated bit string. It is thus assumed that the bit values for the seed are uniformly distributed, but known to the adversary, since they are publicly announced by one of the parties.

3 See Lemma A.3 for an exact statement.
4 We use the trace distance to measure how close two states are; see Appendix A for an exact definition.
5 See Definition A.2 for a formal definition.

The most commonly used (at least in theoretical considerations) strong extractor in quantum key distribution protocols is based on two-universal hash functions [6]. A family is a collection of functions that map longer bit strings to shorter bit strings. Over a random choice of the function from the family, two-universality requires that it is extremely unlikely for different bit strings to be mapped to the same output. While universal hashing is optimal in the entropy loss (see footnote 6), the required seed length (determined by the size of the function family: as many bits as necessary to randomly select one member) scales as a multiple of $n$, the input data length (or, in the case of almost universal hashing [6], as a multiple of $m$, the output length).
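
As an illustration of why the seed of a two-universal family grows with the input length, consider multiplication by a random binary Toeplitz matrix, a standard two-universal family. This is an assumption made for the sake of the example and not necessarily the family used in Ref. [6] or in our implementation. Selecting one member requires $n + m - 1$ seed bits.

```python
def toeplitz_hash(x_bits, seed_bits, m):
    """Multiply the m x n Toeplitz matrix defined by seed_bits with x over GF(2).

    The matrix entry in row i and column j is seed_bits[i - j + n - 1],
    so it only depends on i - j, as required for a Toeplitz matrix.
    """
    n = len(x_bits)
    if len(seed_bits) != n + m - 1:
        raise ValueError("seed must consist of n + m - 1 bits")
    out = []
    for i in range(m):
        acc = 0
        for j in range(n):
            acc ^= seed_bits[i - j + n - 1] & x_bits[j]
        out.append(acc)
    return out

# Toy parameters: n = 3 input bits, m = 2 output bits, 4 seed bits.
x = [1, 1, 0]
seed = [1, 0, 1, 1]
print(toeplitz_hash(x, seed, 2))
```

Even in this favourable case the seed is longer than the input, in contrast to the poly-logarithmic seed of Trevisan's construction.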

It is important to emphasise that strong extractors 
provide security just based on an entropic assumption, 
namely the amount of (conditional) min-entropy of the 
initial distribution. In contrast, pseudo-random number 
generators are based on complexity theoretic assump- 
tions. For instance, the presumed existence of functions 
that are hard to invert on average in polynomial time 
can be turned into an algorithm taking a short random 
seed and producing an output distribution that "looks" 
like the uniform distribution to any algorithm running in 
polynomial time (see Ref. [10] for further information and formal definitions). While such generators greatly outperform our current implementation (see footnote 7), they require much
stronger assumptions and give rise to weaker promises on 
the output distribution. 

After this general discussion on extractors and related 
issues, we now describe Trevisan's construction in more 
detail. 

B. Trevisan's Construction 

Trevisan's seminal contribution originates in the insight that a certain class of error-correcting codes (ECC), called list-decodable codes [13], can be re-interpreted as extractors. In fact, the codes are one-bit extractors, and deliver a single perfectly random bit from a larger reservoir of slightly random bits. Since an error-correcting code is a deterministic mapping from shorter into longer bit strings that makes them more robust against the influence of errors acting on the encoded data, the connection between ECCs and bit extractors is not immediately obvious. Trevisan's first observation was that if we randomly select a position of an ECC's output string, the corresponding bit is uniformly distributed, provided that the initial distribution has enough min-entropy. If the code outputs bit strings of length $\bar{n} = \mathrm{poly}(n)$, a logarithmically long seed of random bits is needed, since exactly $\log \bar{n}$ bits are necessary to specify a position of an $\bar{n}$-bit string.

6 An extractor will always have an entropy loss $\Delta \geq 2 \log(1/\varepsilon) + O(1)$, where $\varepsilon$ is the error of the extractor [9].

7 Practical implementations of pseudo-random number generators, among them the variant used in the Linux kernel, rely on cryptographic hash functions like SHA-512 [12]. Since these functions, in turn, are used in numerous computing scenarios that extend well beyond cryptography, many recent CPUs offer special-purpose machine instructions that allow for particularly efficient implementations. This makes it practically impossible for an implementation of Trevisan's construction to beat the throughput of cryptographic hash algorithms that are, besides, much simpler from an algorithmic point of view.

Of course, we are interested in much longer outputs 
than just a single bit. The second observation of Tre- 
visan states that outputs of many uses of the one-bit 
extractor can be concatenated so that the output is still 
uniformly distributed, and that we do not need to choose 
a completely new set of random seed bits for every use of 
the one-bit extractor. This is achieved using a construc- 
tion of Nisan and Wigderson [T3], the Nisan-Wigderson 
pseudo-random generator. The basic idea is that the ini- 
tial choice of random bits taken from the seed is divided 
into sets of random bits with small overlap. For example, 
100 random bits are divided into 15 sets, each consisting 
of 10 bits. If the overlap is not too large, there are not 
too many correlations induced by seeding the elements of 
each set into the one-bit extractors and concatenating the 
output bits. The randomness available in the initial dis- 
tribution can then be used to cope with these additional 
correlations. Dividing the original seed bits into smaller sets is done using an algorithm called weak design. The complete process is summarised in Figure 2.

It turns out that there are many examples of one-bit 
extractors and weak designs that fulfil the requirements 
needed for the above procedure to work. Trevisan's con- 
struction is therefore not really a single algorithm, but 
rather a recipe to combine different one-bit extractors 
and weak designs to generate a quantum-proof strong ex- 
tractor. The exact choice of either building block (we also 
refer to them as primitives in the following) depends very 
much on the application and on the parameter regime of 
interest. Consequently, we decided to implement differ- 
ent possible choices and let the operator decide which 
ones to use. We now present two exemplary use cases that highlight the need to prioritise between speed, entropy loss, and the assumptions on the initial randomness.

Suppose first that we have at our disposal a fast source providing very good random numbers, or equivalently, having a very high entropy. Ideally, we would like to extract all randomness, but since producing new random numbers is fast, we can allow a substantial entropy loss, concentrating on performance instead. In this case, the combination of the GF(p)-weak design with the XOR one-bit extractor is the best choice, achieving a throughput of about 17 kbit/s on a normal notebook machine and about 160 kbit/s on a large workstation with 48 cores. The extractor can handle input lengths of several Gibit, which is also necessary since in these extreme cases only one percent of the available entropy is extracted. This means that the source needs to provide random numbers at a rate of about 20 Mibit/s. The required amount of seed for the one-bit extractor is 1.7 Kibit, which leads to a total seed of roughly 2.9 Mibit for 4 Gibit of input data.

FIG. 2. Interplay of components in Trevisan's extractor. The weak design expands the initial seed into multiple smaller packets with certain overlap properties whose cumulative length can considerably exceed the seed length. Each is fed into a 1-bit extractor B, which distills a single random bit out of the global randomness for each packet. The bits are finally concatenated to form the extracted randomness. All components that make up Trevisan's algorithm are highlighted in a grey box.

If we consider a source of very low entropy and focus on 
small entropy loss rather than throughput, the optimal 
choice turns out to be the block weak design together 
with polynomial hashing for the one-bit extractor. It 
works for any lower bound on the entropy, has almost 
minimal entropy loss, and requires the shortest seed of 
all constructions. It is, however, much slower than the 
first combination: A throughput of only a few kbit/s 
is achieved on a notebook computer, or 70 kbit/s with 
48 cores, albeit for a much shorter input length of $2^{16}$ bits: 100 bits are necessary for the one-bit extractor, which results in 10 Kibit of total seed for the standard weak design, and slightly less than 300 Kibit for the block weak design needed to extract nearly all the entropy.

These are just two examples; proper performance measurements as well as a discussion of possible improvements and aspects of high-performance computing can be found in Section V.



III. DERIVATIONS 
A. Trevisan's extractor 

1. Description 

As briefly sketched in the previous section, Trevisan's construction consists of applying the same one-bit extractor multiple times to the input string, using different weakly correlated seeds for each run. The seeds are chosen as substrings of some longer seed $y \in \{0,1\}^d$. Let $\{S_i\}_i$ be a family of sets such that for all $i$, $|S_i| = t$ and $S_i \subset [d] = \{1, \dots, d\}$. Then $y_{S_i}$, the string formed by the bits of $y$ at the positions given by the elements $j \in S_i$, is a string of length $t$. For a given one-bit extractor $C : \{0,1\}^n \times \{0,1\}^t \to \{0,1\}$, and such a family of sets $\{S_i\}_{i=1}^m$, Trevisan's extractor is defined as the concatenation of the output bits of $C$ when used with the seeds $y_{S_i}$, namely

$$\mathrm{Ext}(x, y) := C(x, y_{S_1}) \cdots C(x, y_{S_m}). \tag{1}$$
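
The structure of Eq. (1) can be sketched in a few lines. The one-bit function used below, the parity of the bitwise product of $x$ with the sub-seed, is only a placeholder chosen for brevity: it needs $t = n$ seed bits per call and is not one of the one-bit extractors discussed in this work. The function names and the toy sets are likewise hypothetical, and positions are counted from 0 rather than from 1.

```python
def select_bits(y, S):
    """Return y_S, the bits of the seed y at the (0-based) positions in S."""
    return [y[j] for j in S]

def inner_product_bit(x, s):
    """Placeholder one-bit function: parity of the bitwise product of x and s."""
    if len(x) != len(s):
        raise ValueError("this placeholder needs a sub-seed as long as the input")
    return sum(a & b for a, b in zip(x, s)) % 2

def trevisan_extract(x, y, sets, one_bit=inner_product_bit):
    """Concatenate C(x, y_S) over all sets S, following the shape of Eq. (1)."""
    return [one_bit(x, select_bits(y, S)) for S in sets]

# Toy instance: n = 4, d = 6, t = 4, m = 3.
x = [1, 0, 1, 1]
y = [1, 1, 0, 1, 0, 1]
sets = [[0, 1, 2, 3], [2, 3, 4, 5], [0, 1, 4, 5]]
print(trevisan_extract(x, y, sets))
```

Passing the one-bit function as a parameter mirrors the modular idea of the construction: the weak design and the one-bit extractor can be exchanged independently.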



The performance of the extractor naturally depends on 
the performance of the one-bit extractor, but also on the 
independence of the seeds used for each run of the one-bit extractor. Intuitively, the smaller the cardinality of the intersections of the sets $\{S_i\}$, the more randomness we can extract from the source, but the larger the seed. The exact condition is given in the following definition.



Definition III.1 (weak design [15]; see footnote 8). A family of sets $S_1, \dots, S_m \subset [d]$ is a weak $(m, t, r, d)$-design if

1. For all $i$, $|S_i| = t$.

2. For all $i$, $\sum_{j=1}^{i-1} 2^{|S_j \cap S_i|} \le rm$.

In the following, we refer to the parameter $r$ as the overlap of the weak design.
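
The two conditions of Definition III.1 are easy to check mechanically for a given family of sets. The following sketch uses hypothetical function names and a toy family that is not produced by any of the constructions discussed later. It evaluates the largest left-hand side of the second condition and compares it with $rm$.

```python
def max_overlap_sum(sets):
    """Largest value of sum_{j<i} 2^{|S_j ∩ S_i|} over all i."""
    as_sets = [frozenset(S) for S in sets]
    worst = 0
    for i in range(len(as_sets)):
        total = sum(2 ** len(as_sets[j] & as_sets[i]) for j in range(i))
        worst = max(worst, total)
    return worst

def is_weak_design(sets, t, r, d):
    """Check both conditions of a weak (m, t, r, d)-design (0-based positions)."""
    m = len(sets)
    sizes_ok = all(len(set(S)) == t for S in sets)
    range_ok = all(0 <= j < d for S in sets for j in S)
    return sizes_ok and range_ok and max_overlap_sum(sets) <= r * m

# Toy family with m = 3 sets of size t = 4 over d = 6 positions.
sets = [[0, 1, 2, 3], [2, 3, 4, 5], [0, 1, 4, 5]]
print(max_overlap_sum(sets))           # 2^2 + 2^2 = 8 for the last set
print(is_weak_design(sets, 4, 3, 6))   # 8 <= 3 * 3
print(is_weak_design(sets, 4, 2, 6))   # 8 >  2 * 3
```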

As an example, if we use a quantum-proof $(k, \varepsilon)$-strong extractor as one-bit extractor and a weak $(m, t, r, d)$-design, the construction given by (1) is a quantum-proof $(k + rm, m\varepsilon)$-strong extractor (see Lemma B.8). Thus, if $r = 1$, Trevisan's extractor has roughly the same entropy loss as the underlying one-bit extractor. Note also that the error $\varepsilon$ of the one-bit extractor is the error per bit for Trevisan's construction.



2. Constructions overview 

We always denote the input length by $n$, and the output length by $m$. We choose $\varepsilon$ in such a way that it corresponds to the error per bit for the final construction. We use $d$ to describe the seed length of Trevisan's extractor and $t$ for the seed length of the underlying one-bit extractor; $r$ denotes the overlap of the weak design, and $k$ the min-entropy required in the source, which is often expressed as $k = \alpha n$.

In the following, we briefly summarize the constructions described in Sections III B and III C. We take the input length $n$, output length $m$, and error per bit $\varepsilon$ to be fixed, and calculate the seed length $d$ and entropy needed in the source $k$ as functions of these three parameters.

8 The second condition of the weak design was originally defined as $\sum_{j=1}^{i-1} 2^{|S_j \cap S_i|} \le r(m - 1)$. We prefer to use the version of [16], since it simplifies the notation without changing the design constructions.

a. Weak designs: In Section III B we describe two weak designs. The first was originally proposed by Nisan and Wigderson [17], and has parameters $d = t^2$ and $r = 2e$ (with $e$ denoting Euler's number) for any prime power $t$ and any $m$. This means that the seed length of the final construction is the square of the seed length of the one-bit extractor, and the entropy loss induced by the weak design is $(2e - 1)m \approx 4.436m$. The second construction iterates the first; it has a larger $d = a t^2$ for

$$a := \left\lceil \frac{\log(m - 2e) - \log(t - 2e)}{\log 2e - \log(2e - 1)} \right\rceil, \tag{2}$$

but $r = 1$, i.e., the design does not cause any entropy loss.

b. XOR-code: The XOR-code is a one-bit extractor, 
which simply computes the XOR of a substring of the 
input. With the two different weak designs, we find that 
the randomness and seed needed are 



k = jn + rm - 
2 In 2 



6 log 



f 



log . 



h-H-r) 

d = t 2 or at 2 , 



log n log 



(2 + V2) 



where $\gamma$ is a free parameter that influences the amount of extracted randomness and the length of the initial seed (details in Section III C 1), and $a$ is given by Eq. (2). $h(p) = -p \log p - (1 - p) \log(1 - p)$ is the binary entropy function, and its inverse is defined on the interval $(0, 1/2)$.

c. Lu's construction: This one-bit extractor selects 
a random substring of the input by performing a walk 
on an expander graph, and then hashes the result to one 
bit by taking the parity of the bitwise product with a 
random string. With the two different weak designs, we 
find that the randomness and seed needed are 



2 ~1~ v 2 

k = h(v)n + rm + 6 log 2, 

£ 



t = \ogn + 3c(£-l)+£, 
log w 
21og5\/2/8 

81oge - 81og(2 + \/2) 
log(l — v + w) 

d = t 2 or at 2 , 



where $v < 1/2$ is a free parameter, $a$ is given by Eq. (2), and $w$ is the solution to the equation (see footnote 9)

$$w \log w = (1 - v + w) \log(1 - v + w).$$

d. Polynomial hashing: This construction uses almost universal hash functions. With the two different weak designs, we find that the randomness and seed needed are

$$k = rm + 4 \log\frac{1}{\varepsilon} + 6,$$

$$t = 2 \left\lceil \log n + 2 \log\frac{2}{\varepsilon} \right\rceil,$$

$$d = t^2 \text{ or } a t^2,$$

where $a$ is given by (2).
B. Weak designs

The weak design construction we use (see Section III B 1 for a description) is originally from Nisan and Wigderson [17], who proved that it is a standard design, a notion stronger than weak designs, originally used by Trevisan [3], but which Raz et al. [15] showed to be unnecessary. Hartman and Raz [16] proved that this construction is a weak $(m, t, r, d)$-design with overlap $r = e^2$ for a prime $t$, $d = t^2$, and $m$ a power of $t$. Ma and Tan [18] improved Hartman and Raz's analysis, and showed that $r = e$ for any prime power $t$ and any $m$ which is a multiple of a power of $t$. However, for a practical implementation, we need a construction that is valid for any $m$. We prove in Appendix C 1 that this construction is a weak $(m, t, r, d)$-design for any prime power $t$, any $m$, and $r = 2e$ (see footnote 10).

As mentioned in Section III A, a larger overlap leads to a larger entropy loss. In Section III B 2 we adapt an iterative construction of the basic design from Ma and Tan [18], to construct a new design with $r = 1$. We prove in Appendix C 2 that this construction is correct.



1. Basic construction 

In this section we describe a weak design construction, that is, we define a family of sets that satisfy the conditions of Definition III.1.



9 $w < v$ can actually be chosen freely. The above value minimizes the walks on the expander graph.
10 Hartman and Raz [16, Corollary 2] show that there exist a $d = O(t^2)$ and $r = O(1)$ such that for any m > t log t the construction is a $(m, t, r, d)$-design; however, the restriction m > t log t and constants in the $O$-notation which depend on $m$ make this unusable in practice. Ma and Tan [18] conjecture that the basic construction is a weak $(m, t, e, t^2)$-design for any $m$, and use this in their implementation. To make up for the lack of proof, they simply count the intersections between the sets $S_i$ after generating the design, to make sure that the overlap is bounded by
This construction makes use of polynomials over a finite field $\mathrm{GF}(t)$. Every set $S_p$ is indexed by one such polynomial $p : \mathrm{GF}(t) \to \mathrm{GF}(t)$. To construct a weak $(m, t, r, d)$-design we need $m$ sets, and therefore $m$ such polynomials, which we take in increasing order of their coefficients. For example, if $m = 6$ and $t = 2$, the polynomials are $\sum_{i=0}^{2} \alpha_i x^i$, with the coefficients $(\alpha_2, \alpha_1, \alpha_0)$ taken in the following order: (0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1). In general, the $n$th polynomial is given by $p(x) = \sum_{i=0}^{c} \alpha_i x^i$ with $\alpha_i = \lfloor (n - 1)/t^i \rfloor \bmod t$ and $c = \lceil \log m / \log t - 1 \rceil$.

The elements of the set $S_p$ are all the pairs of values $S_p := \{(x, p(x)) : x \in \mathrm{GF}(t)\}$. Each set thus has $|S_p| = t$ elements, and $S_p \subset [d]$ holds for $d = t^2$, where we map $[d]$ to $[t] \times [t]$ in the obvious way. We prove in Appendix C 1 that for all $m$ and $p$, this construction has $\sum_{\{q < p\}} 2^{|S_p \cap S_q|} < 2em$, where by $\{q < p\}$ we mean the set of all polynomials that come before $p$.
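The basic construction can be sketched in a few lines. The following is only an illustration under simplifying assumptions: `t` is taken to be prime (so that GF(t) is the integers modulo `t`), polynomials are numbered from 0 rather than 1, and the function name `basic_design_set` as well as the mapping of the pair $(x, p(x))$ to the index $x \cdot t + p(x)$ are our own choices, not taken from the reference implementation.

```python
def basic_design_set(i, t, m):
    """Return S_p for the i-th polynomial (0-based) as indices in [t*t].

    Assumes t is prime, so GF(t) is the integers modulo t.
    """
    # Smallest c with t**(c + 1) >= m, i.e. c = ceil(log m / log t - 1),
    # computed with integers only to avoid floating-point rounding.
    c = 0
    while t ** (c + 1) < m:
        c += 1
    # Coefficients are the base-t digits of i: alpha_j = floor(i / t^j) mod t.
    alpha = [(i // t ** j) % t for j in range(c + 1)]
    indices = []
    for x in range(t):
        # Horner evaluation of p(x) = sum_j alpha_j x^j over GF(t).
        y = 0
        for a in reversed(alpha):
            y = (y * x + a) % t
        # Map the pair (x, p(x)) in [t] x [t] to a single index in [t^2].
        indices.append(x * t + y)
    return indices
```

For $t = 5$ and $m = 25$ all polynomials have degree at most 1, so any two distinct sets share at most one element, which is easy to check by brute force.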




2. Reducing the overlap 



Note that any weak design can be viewed as a binary $(m \times d)$-matrix $W$, where the value $W_{ij} = 1$ if $j \in S_i$. To construct a weak design with $r = 1$, we will use the construction from Section III B 1 repeatedly with different values $m_i$ (but the same $t$), obtaining designs $W_{B,0}, \dots, W_{B,\ell}$. We then construct a new design $W$ by placing these in its diagonal, that is,

$$W := \begin{pmatrix} W_{B,0} & & \\ & \ddots & \\ & & W_{B,\ell} \end{pmatrix}.$$

FIG. 3. Total seed length required for the weak block design (based on an elementary design with $r = 2e$) depending on the seed length of the 1-bit extractor (horizontal axis). The inset shows the number of blocks; since the seed size for the block design depends linearly on the block number, this is also the seed overhead associated with the use of a block design. For typical parameters, the amount of seed grows by a factor of 50.

Let $m$ and $t$ be fixed, and let $r' = 2e$ be the parameter from the basic construction. The number of calls to the basic construction is $\ell + 1$, where

$$\ell := \max\left\{1, \left\lceil \frac{\log(m - r') - \log(t - r')}{\log r' - \log(r' - 1)} \right\rceil\right\}. \tag{3}$$



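Each design $W_{B,i}$ is constructed with $m_i$ sets, defined as follows:

$$\begin{aligned}
n_i &:= \left(1 - \frac{1}{r'}\right)^{i} \left(\frac{m}{r'} - 1\right) && \text{for } 0 \le i \le \ell - 1, \\
m_i &:= \left\lceil \sum_{j=0}^{i} n_j \right\rceil - \sum_{j=0}^{i-1} m_j && \text{for } 0 \le i \le \ell - 1, \\
m_\ell &:= m - \sum_{j=0}^{\ell-1} m_j.
\end{aligned} \tag{4}$$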

The weak design $W$ thus has $d = (\ell + 1)t^2$. In Appendix C 2 we prove that this construction has $r = 1$.

Figure 3 discusses the parameter behaviour of the block weak design.
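As a worked example of the block parameters, the sketch below computes $\ell$ and the block sizes $m_0, \dots, m_\ell$ for given $m$ and $t$. It assumes $m > r'$ and $t > r' = 2e$, and it assumes that the block sizes are the differences of rounded-up partial sums of $n_i = (1 - 1/r')^i (m/r' - 1)$, which is our reading of Eq. (4). The function name `block_parameters` is hypothetical, and floating-point logarithms are used purely for brevity.

```python
import math

def block_parameters(m, t):
    """Return (l, sizes) with sizes = [m_0, ..., m_l] for the block design."""
    r = 2 * math.e  # overlap r' of the basic construction
    l = max(1, math.ceil((math.log(m - r) - math.log(t - r))
                         / (math.log(r) - math.log(r - 1))))
    sizes = []
    partial = 0.0   # running sum of the real-valued n_j
    assigned = 0    # running sum of the integer m_j
    for i in range(l):
        partial += (1 - 1 / r) ** i * (m / r - 1)
        m_i = math.ceil(partial) - assigned
        sizes.append(m_i)
        assigned += m_i
    sizes.append(m - assigned)  # the last block m_l takes the remaining sets
    return l, sizes
```

For $m = 1000$ and $t = 16$ this gives $\ell = 23$, so the block design needs a seed of $d = 24 \cdot 16^2 = 6144$ bits; the block sizes sum to $m$ and the last block contains no more than $t$ sets.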



C. One-bit extractors 

1. XOR-code 

This extractor computes the XOR for $\ell$ random positions of the input; it is thus an $\ell$-local extractor (see Appendix A for a precise definition). This construction is efficient to compute, but requires a seed of length $t \in O(\log n \log 1/\varepsilon)$, where $n$ is the input length and $\varepsilon$ the error of the construction, instead of the optimal $O(\log n + \log 1/\varepsilon)$. It also has an entropy loss linear in the input length.

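Lemma III.2 (XOR-code [19, Theorem 41]¹¹). For any $\varepsilon > 0$, $n \in \mathbb{N}$ and $\ell \in [n]$, the function

$$C_{n,\ell} : \{0,1\}^n \times [n]^\ell \to \{0,1\}, \qquad (x, i_1, \dots, i_\ell) \mapsto \bigoplus_{j=1}^{\ell} x_{i_j}$$

is a classical-proof $\ell$-local $(k, \varepsilon)$-strong extractor with $k = h\!\left(\frac{\ln 2}{\ell} \log \frac{2}{\varepsilon}\right) n + 3 \log \frac{1}{\varepsilon} + \log \frac{4}{3}$ and seed length $t = \ell \log n$, where $h(p) = -p \log p - (1 - p) \log(1 - p)$ is the binary entropy function.

¹¹ [19, Theorem 41] actually proves that this construction is a $\delta$-approximately $(\varepsilon, L)$-list-decodable code. But such a code is an $(\varepsilon, L\,2^{h(\delta) n})$-list-decodable code, which in turn is a classical-proof extractor by Lemma B.3.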

By Lemma B.5 this construction is a quantum-proof $(k, (1 + \sqrt{2})\sqrt{\varepsilon})$-strong extractor. And by the corresponding lemma of Appendix B, if we use this in Trevisan's construction, the final extractor is a quantum-proof $(k + rm, m(1 + \sqrt{2})\sqrt{\varepsilon})$-strong extractor.

Let our source have min-entropy $H_{\min}(X|E) = \alpha n$. We want the entropy loss induced by this one-bit extractor to be roughly $\gamma n$, and need to find the appropriate $\ell$ for the desired value of $\gamma$, since $\gamma(\ell, \varepsilon) = h\!\left(\frac{\ln 2}{\ell} \log \frac{2}{\varepsilon}\right)$ for some $\gamma < \alpha$. Solving for $\ell$, we find $\ell(\gamma, \varepsilon) = \frac{\ln 2}{h^{-1}(\gamma)} \log \frac{2}{\varepsilon}$. This implies that $\gamma$ directly influences the length of the seed, which we discuss below. Since the inverse of the binary entropy function is not analytically available, we need to resort to numerical techniques to determine the appropriate value of $\ell$ for a given $\gamma$. It is convenient to distinguish the experimental entropy deficiency $\alpha$ from the loss induced by the extraction procedure by introducing a parameter $\mu$ such that $\gamma = \mu\alpha$.
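One simple numerical technique for this inversion is bisection, since the binary entropy function is strictly increasing on $[0, 1/2]$. The sketch below is an illustration only: the function names are hypothetical, the relation $\gamma = h\!\left(\frac{\ln 2}{\ell} \log \frac{2}{\varepsilon}\right)$ is assumed as stated for the XOR code, and rounding $\ell$ up to an integer is our own choice. The reference implementation may use a different optimisation routine.

```python
import math

def binary_entropy(p):
    """h(p) = -p log2 p - (1 - p) log2 (1 - p), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def inverse_binary_entropy(gamma, tol=1e-12):
    """Return p in [0, 1/2] with h(p) = gamma, for 0 <= gamma <= 1."""
    lo, hi = 0.0, 0.5  # h is strictly increasing on this interval
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_entropy(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def xor_positions(gamma, eps):
    """l(gamma, eps) = (ln 2 / h^{-1}(gamma)) * log2(2 / eps), rounded up.

    Requires gamma > 0, otherwise h^{-1}(gamma) vanishes.
    """
    return math.ceil(math.log(2) / inverse_binary_entropy(gamma)
                     * math.log2(2 / eps))
```

Smaller values of $\gamma$ lead to smaller $h^{-1}(\gamma)$ and therefore to more positions $\ell$, that is, to a longer seed.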

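For $\varepsilon = \frac{\varepsilon'^2}{(1+\sqrt{2})^2}$, Trevisan's construction is a quantum-proof $(k, \varepsilon' m)$-strong extractor with $k = \gamma n + rm + 6 \log \frac{1+\sqrt{2}}{\varepsilon'} + \log \frac{4}{3}$. The seed of the one-bit extractor has length

$$t = \ell \log n, \qquad \ell = \frac{2 \ln 2}{h^{-1}(\gamma)} \log \frac{2 + \sqrt{2}}{\varepsilon'},$$

and the seed of the complete construction has length $d$.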

Especially the choice of $\mu$ influences the behaviour of the XOR extractor. Figures 4, 5 and 6 depict and discuss the effect of the various chosen and inferred parameters.



2. Lu's construction 

Lu [20] shows how to construct a local one-bit extrac- 
tor, i.e., an extractor for which each bit of the output 
only depends on a subset of the input bits. He then uses 
his one-bit extractor in Trevisan's construction. Here, 
we adapt the parameters of his construction to build a 
quantum-proof extractor. 

Lu's extractor proceeds in two steps. The first consists in selecting a substring of the input; the second hashes this string to one bit.¹² To select the substring of the input, he performs a random walk on a $g$-regular graph, i.e., a graph in which every vertex is connected to exactly $g$ other vertices.

Recall that a graph $G$ is uniquely identified by its vertices and edges, and is consequently specified by $G = (V, E)$, where $V$ is the vertex set and $E$ the edge set. An alternative representation of more importance in our context is the adjacency matrix. For a graph with $n$ vertices, this is an $n \times n$ matrix in which the entry $A_{ij}$ denotes the number of edges from vertex $i$ to vertex $j$. The diagonal is typically filled with ones; since the graphs considered here are undirected (i.e., the direction of edges is not taken into account, only the fact that two vertices are connected), the adjacency matrix is symmetric.

FIG. 4. XOR parameter behaviour overview; $\mu$ varies per column, $\varepsilon$ per row. The influence of the initial source entropy $\alpha$ is mostly negligible, especially for small values of the extraction ratio $\mu$. However, there is a drastic increase in seed size for $\mu > 0.16$, which restricts the XOR method to a practical upper bound of the extraction ratio of about 10% of the source entropy.

The eigenvalues of the adjacency matrix are referred to as eigenvalues of the graph. For our purpose, the ratio between the second largest and largest eigenvalue plays an important role, and is labelled as $\lambda$. Graphs with a small $\lambda$ are called expander graphs, and are common objects in pseudo-randomness generation; see Ref. [22] for a review.

For an input string of length $n$, we choose a graph with $n$ vertices, so that each vertex corresponds to a bit position of the string. Let $(v_1, \dots, v_\ell)$ be the vertices visited during a walk of $\ell$ steps. We select the $\ell$ corresponding bits of the input $x$, that is, $(x_{v_1}, \dots, x_{v_\ell})$, and then hash it by computing the parity of the bitwise product of this string with a random seed $\beta \in \{0, 1\}^\ell$.¹³ The output is thus $z = \bigoplus_{i=1}^{\ell} \beta_i x_{v_i}$.

Lu [20] proves that the concatenation of the output 
bits z for all possible seeds is a (6, L)-list decodable code 



¹² This type of construction is sometimes referred to as sample-then-extract [21], although Lu [20] simply describes it as a local list-decodable code.

¹³ This hash function is also used in the polynomial hashing construction.
FIG. 5. XOR extractor: seed to extraction ratio (horizontal axis: $n$, the number of input bits, log scale). Ratio between output size and seed length for various input sizes. The parameters $\mu = 0.05$ and $\varepsilon = 10^{-7}$ are fixed for this computation. A ratio of 1, indicated by a dashed red line, denotes the parameter regime where the amounts of extracted bits and the seed spent coincide; for values exceeding this threshold, the particular advantages of Trevisan's construction over two-universal hashing prevail because the ratio is better than what can be achieved with the latter approach. Recall, though, that the seed acts as a catalyst that can be included in the final randomness since the extracted bits are independent of the seed.
The initial source entropy $\alpha$ accounts for a variation of about one order of magnitude of the extraction threshold. As a rule of thumb, the break-even point is at input sizes of roughly $10^9$ bits, which amounts to approximately $2^{30}$ bits (roughly 128 MiB) of data.
The inset shows the number of extracted bits less the seed spent.



with 



for a $\nu < 1/2$ given by

$$\nu = 1 + \lambda^2 - \delta^{4/\ell}. \tag{6}$$

Since $\delta^{4/\ell} < 1$, Eq. (6) can only be satisfied if $\lambda^2 < \nu$. This can be obtained by taking as expander graph $G$ a given construction $G_0$ to the power $c$. $G$ is defined as the graph with adjacency matrix $A = A_0^c$, where $A_0$ is the adjacency matrix of $G_0$. We then have $\lambda = \lambda_0^c$. A random walk of length $\ell$ on $G$ is equivalent to a random walk of length $\ell c$ on $G_0$, in which only the first of every $c$ steps is remembered, and the others deleted [23].

FIG. 6. XOR extractor: break-even points (horizontal axis: extraction fraction $\mu$), i.e., the minimal input length for which the amount of extracted randomness exceeds the required seed size, for varying values of $\mu$ and $\varepsilon$. The parameter $\alpha$ is fixed to 0.8. As a rule of thumb, $\mu = 0.05$ is close to the optimal value regardless of the error parameter $\varepsilon$.

To construct the regular expander graph $G_0$, we employ an algorithm reviewed in Ref. [22]. Let us only summarise the essential facts here:

• The construction is restricted to degree $g = 8$, and the ratio between the second-largest and largest eigenvalue can be shown to be $\lambda_0 = 5\sqrt{2}/8 \approx 0.884$.

• It is possible to compute the graph for all dimensions (i.e., number of nodes) that can be expressed as $\ell^2$ for $\ell \in \mathbb{N}$. This restriction is much more relaxed than for other constructions, and does not pose any problems in real applications. Formally, the vertex set of the graph is defined on $\mathbb{Z}_\ell \times \mathbb{Z}_\ell$. Each vertex $(x, y) \in \mathbb{Z}_\ell \times \mathbb{Z}_\ell$ is connected to the vertices $(x \pm 2y, y)$, $(x \pm (2y + 1), y)$, $(x, y \pm 2x)$, and $(x, y \pm (2x + 1))$, which uniquely defines the edges. Notice that the arithmetic must be performed modulo $\ell$, so the computationally (comparatively) cheap additions and multiplications are unfortunately accompanied by an expensive modulo division.¹⁴

• The complete graph does not need to be computed 
in advance, but can be constructed during the ran- 
dom walk, and using a constant amount of space. 
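The on-the-fly construction summarised in the list above can be sketched as follows. The neighbour rule assumed here is the symmetric one, $(x \pm 2y, y)$, $(x \pm (2y + 1), y)$, $(x, y \pm 2x)$ and $(x, y \pm (2x + 1))$, with all arithmetic modulo the side length. The function names, the ordering of the eight edge labels, and the representation of a walk as a list of labels are our own illustrative choices.

```python
def neighbours(x, y, side):
    """The 8 neighbours of vertex (x, y) on Z_side x Z_side."""
    return [
        ((x + 2 * y) % side, y), ((x - 2 * y) % side, y),
        ((x + 2 * y + 1) % side, y), ((x - 2 * y - 1) % side, y),
        (x, (y + 2 * x) % side), (x, (y - 2 * x) % side),
        (x, (y + 2 * x + 1) % side), (x, (y - 2 * x - 1) % side),
    ]

def walk(start, directions, side):
    """Follow a list of edge labels in range(8); returns the visited vertices."""
    path = [start]
    for d in directions:
        # Only the current vertex is needed to compute the next one, so the
        # graph itself is never stored.
        path.append(neighbours(*path[-1], side)[d])
    return path
```

A vertex $(x, y)$ could then be mapped to the bit position $x \cdot \text{side} + y$ of the input. Each step needs only the current vertex, which is what makes the constant-space walk possible.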



¹⁴ An obvious optimisation, available because the multiplicative factor 2 is small, is to compute the modulo division not unconditionally, but only when the intermediate result really exceeds $\ell$.
For a given $\nu$, we choose $c$ and $\ell$ which minimize the number of steps $c\ell$. By setting $w := \lambda_0^{2c}$ and taking the derivative of $c\ell$ with respect to $w$, we find that the minimum is obtained for the $w$ which is the solution of the equation

$$w \log w = (1 - \nu + w) \log(1 - \nu + w).$$

The number of steps on the expander graph is then determined by $c = \frac{\log w}{2 \log \lambda_0}$ and $\ell = \frac{4 \log \delta}{\log(1 - \nu + w)}$.

The walk on the $g$-regular graph requires $\mathrm{nb}(n)$ bits of seed to choose the first vertex, and $c(\ell - 1) \log g$ bits for the direction of the walk for each following step. The final hashing uses $\ell$ bits of seed, for a total of $t = \mathrm{nb}(n) + c(\ell - 1) \log g + \ell$.
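The optimisation over $w$ has no closed-form solution, but the function $f(w) = w \log w - (1 - \nu + w) \log(1 - \nu + w)$ is strictly decreasing on $(0, \nu)$ with $f(0^+) > 0 > f(\nu)$, so bisection suffices. The following sketch assumes $\lambda_0 = 5\sqrt{2}/8$ and the expressions $c = \log w / (2 \log \lambda_0)$ and $\ell = 4 \log \delta / \log(1 - \nu + w)$; it returns real-valued parameters, and how they are rounded to integers is left open here. All names are hypothetical.

```python
import math

LAMBDA_0 = 5 * math.sqrt(2) / 8  # eigenvalue ratio of the base graph G_0

def f(w, nu):
    """Stationarity condition: f(w) = 0 at the minimum of c * l."""
    return w * math.log(w) - (1 - nu + w) * math.log(1 - nu + w)

def optimal_walk_parameters(nu, delta, tol=1e-15):
    """Real-valued (w, c, l) minimising c * l for 0 < nu < 1/2, 0 < delta < 1."""
    lo, hi = tol, nu  # f(lo) > 0 > f(hi), and f is strictly decreasing
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid, nu) > 0:
            lo = mid
        else:
            hi = mid
    w = (lo + hi) / 2
    c = math.log(w) / (2 * math.log(LAMBDA_0))
    l = 4 * math.log(delta) / math.log(1 - nu + w)
    return w, c, l
```

For example, `optimal_walk_parameters(0.1, 1e-3)` yields a $w$ strictly between 0 and $\nu = 0.1$, consistent with the requirement $\lambda^2 < \nu$.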

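From Lemma B.3 and the list-decodability result above, Lu's one-bit extractor is a classical-proof $(h(\nu)n + 3 \log \frac{1}{\delta} - 2, 2\delta)$-strong extractor. By Lemma B.5 it is a quantum-proof $(h(\nu)n + 3 \log \frac{1}{\delta} - 2, (2 + \sqrt{2})\sqrt{\delta})$-strong extractor. When used with a weak $(m, t, r, d)$-design, Trevisan's construction is, by the corresponding lemma of Appendix B, a quantum-proof $(k, m\varepsilon)$-strong extractor with $\varepsilon = (2 + \sqrt{2})\sqrt{\delta}$ and

$$k = h(\nu)n + rm + 6 \log \frac{2 + \sqrt{2}}{\varepsilon} - 2.$$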

Unfortunately, Lu's construction is of limited use in a practical setting owing to its parameter scaling: the number of random walk steps increases considerably with decreasing parameter $\nu$, see Figure 7. However, as Figure 8 shows, small values of $\nu$ are required for even tiny extraction fractions. Overall, the construction only rarely reaches the parameter regime where it is preferable over two-universal hashing functions (namely, when the length of the extracted bits exceeds the amount of initial seed), as Figure 9 shows.
FIG. 8. Lu extractor: dependency between parameters $\nu$ and $\mu$ for Lu's construction. The largest value of $\nu$ that satisfies the boundary conditions on the available entropy given in Section III C 2 was determined numerically.



3. Polynomial hashing 

Renner [5] proved that universal hash functions¹⁵ are good extractors. Tomamichel et al. [6] showed that the same holds for $\delta$-almost universal ($\delta$-AU$_2$) hash functions, given that $\delta$ is small enough. For the range of $\delta$ that build good extractors, almost universal hashing requires a seed of length $\Omega(m + \log n)$, where $n$ is the input and $m$ the output length. This seed is too large for many applications; however, in the case of one-bit extractors, this reduces to $O(\log n)$, and is achievable with the construction we describe here.

This construction is in fact the concatenation of two hash functions, and uses a seed of length $2\ell$, where $\ell$ will be specified later. The first is known as polynomial hashing, or alternatively as a Reed-Solomon code, because the concatenation of the hashes for all seeds corresponds to the encoding of the input with a Reed-Solomon code. We partition the input string $x \in \{0, 1\}^n$ in blocks $x = (x_1, \dots, x_s)$, each of length $\ell$ (if necessary, we pad the last string $x_s$ with 0s). We view each block as an element of a field, $x_i \in \mathrm{GF}(2^\ell)$, and evaluate the polynomial

$$p_\alpha(x) := \sum_{i=1}^{s} x_i \alpha^{s-i},$$

where $\alpha \in \mathrm{GF}(2^\ell)$ is the first half of the seed. This family is $\frac{s-1}{2^\ell}$-AU$_2$.

¹⁵ See Appendix B 3 for a definition of (almost) universal hashing.

FIG. 7. Number of random walk steps required in Lu's construction in dependence on the output length (horizontal axis: number of output bits $m$, log scale) and the parameter $\nu$.

FIG. 9. Comparison of seed size and output size for Lu's construction. The interesting regime is only reached in very rare cases.

Since the $\delta$ of polynomial hashing is too large (relative to the output length) to build an extractor, we combine it with another hash function, sometimes referred to as a Hadamard code, as the concatenation of the outputs over all seeds corresponds to the Hadamard encoding. This hash function computes the parity of the bitwise product of $p_\alpha(x)$ and the second half of the seed, $\beta \in \{0, 1\}^\ell$. The output is thus $z = \bigoplus_{i=1}^{\ell} \beta_i \, p_\alpha(x)_i$. Since this hash function is $\frac{1}{2}$-AU$_2$, by [53, Theorem 5.4] the combination of the two is $\delta$-AU$_2$ with $\delta = \frac{1}{2} + \frac{s-1}{2^\ell}$.

Choosing $\ell = \lceil \log n + 2 \log 1/\varepsilon' \rceil$ and $s = \lceil n/\ell \rceil$, we get

$$\frac{s-1}{2^\ell} < \frac{n}{\ell\, 2^\ell} \le \frac{\varepsilon'^2}{\ell}.$$

From Theorem B.7 this is a quantum-proof $(4 \log \frac{1}{\varepsilon'} + 2, 2\varepsilon')$-strong extractor. Plugging this into Trevisan's construction with a $(m, 2\ell, r, d)$-design and $\varepsilon = 2\varepsilon'$, we get from Lemma B.8 a quantum-proof $(4 \log \frac{1}{\varepsilon} + 6 + rm, m\varepsilon)$-strong extractor. The seed of the one-bit extractor has length $t = 2\ell = 2 \lceil \log n + 2 \log 2/\varepsilon \rceil$, and the seed of the complete construction has length $d$.

Figure 10 discusses the parameters of the polynomial hashing extractor.

FIG. 10. Polynomial hashing parameter overview (calculations are for $r = 2e$). The parameters are easy to evaluate because there is no dependence on $\alpha$, and there is also no extraction factor $\mu$: the extractor works equally well for high- and low-entropy sources. The required seed is consistently small (shown in the bottom inset); it increases linearly as $\varepsilon$ decreases exponentially.
The degree of the polynomial that needs to be evaluated is the crucial factor. Even for small inputs like $n = 10^6$ bits, corresponding to roughly 122 KiB of data, the degree is $\approx 10000$. Since the polynomial needs to be evaluated for every extracted bit, this makes the polynomial hashing extractor an unsuitable choice for performance-intensive scenarios.
The top inset shows the regime in which the extractor delivers more bits than initially invested for the seed. It outperforms two-universal hashing for a very wide range of parameters.

IV. IMPLEMENTATION

A. Implementation Architecture



We now turn our attention to describing the implementation of the Trevisan extraction framework by first outlining the software architecture, that is, the high-level conceptual point of view, followed by a discussion of some important implementation details and notes on how to add new primitives to the infrastructure. While many important details are still omitted for the sake of brevity, the full source code is available at https://github.com/wolfgangmauerer/libtrevisan for inspection and modification. Besides instructions on how to build the code, the website also contains detailed information on how to use the program, which we will not discuss here any further.
1. Architecture

The architecture was designed to satisfy two particular constraints: correctness and maximum throughput. To achieve the latter, we use C++¹⁶ to implement all performance-critical parts, since the language is statically compiled and does not require the intermediate layers that add runtime penalties to interpreted or byte-compiled languages like, for instance, Matlab, but still allows us to maintain a clean and extensible design based on modern software engineering techniques [26]. The implementation is portable across a wide range of machines from laptops to high performance computing (HPC) machines, and also provides opportunities to benefit from low-level capabilities of recent CPUs, for instance to accelerate bit-level manipulations. We have tested the code on Linux and MacOS machines.

To ensure correctness of the calculations, we base the implementation on independent libraries (NTL [27] and OpenSSL [55] for working with finite fields of arbitrary size) that can be selected at compile time.¹⁷ Checking that both variants arrive at the same results for identical parameter sets increases confidence in the reliability of the calculations. Another means to ensure code correctness is given by a large number of invariants and sanity checks that are spread all across the implementation. To not compromise the performance goals, it is possible to deactivate the checks at compile time so that they incur no runtime penalty.

Another major design decision is the focus on multi-core machines: nowadays, machines with only a single core are a rare exception, and algorithms that are limited to only one thread of execution voluntarily sacrifice a large fraction of the available computational power, which is obviously not desirable in a high-performance setting. We use the threading building blocks library [21] as basis for the implementation, which allows for fine-tuning the distribution of work across the system resources in a precise manner. We also employ a mostly lock-free architecture (see, e.g., Ref. [30] for a review) that avoids any computation stalls due to the need for synchronised communication between computation elements.

The code also contains parts that are not performance-critical, for instance calculating the parameters from given user settings. This is conveniently done in very high-level languages that allow for working in abstract terms without having to consider any details of the underlying machine architecture. To this end, we have integrated the possibility to call code written in the R language (using the techniques provided by Ref. [31]; see [32] for an overview about R), which enjoys widespread use in statistical data processing and machine learning.

¹⁶ We rely on numerous features of the new language standard C++11, so at the time of writing, only sufficiently new compilers are able to build the code.

¹⁷ NTL cannot be used in scenarios with high performance requirements since it is restricted to running on one single core per design, which does not agree well with contemporary machine architectures. It can only be used in a single primitive that requires operations on $\mathrm{GF}(2^x)$ because the library operates with a single, global irreducible polynomial, which makes it effectively impossible to operate on fields of different dimensions simultaneously.

It is also possible to compute the weak design ahead of time, store it on disk and re-use it for multiple runs of the extractor; since computing the weak design is a deterministic operation that does not require any randomness, this is admissible. In matrix representation, a weak design for output length $m$ and a total seed length $d$ is an element of $\mathbb{F}_2^{m \times d}$. Each row contains $t$ ones and $d - t$ zeroes, so the matrix fill for the standard design is $\frac{mt}{m(d-t)} \approx 1/t$. A total seed of 50 KiBit, for instance, amounts to a fill of about 0.5%, which exceeds the threshold for typical sparse matrix techniques to pay off [33]. We found the data transfer times from the underlying block device to be longer than the time required to compute the weak design on the fly, although this may change with the availability of high-speed storage. For the block weak design, the situation is more favourable since only the basic design needs to be stored, and the remaining elements can be reconstructed with very little computational effort.

Finally, we emphasise that the code can either be used 
in stand-alone mode (also including a dry-run mode for 
parameter estimation) , or as a library as part of a larger 
project. 

2. Implementation details 

Weak designs and one-bit extractors are imple- 
mented as C++ classes derived from mixed interface/ 
implementation-type base classes. Trevisan's algorithm 
solely operates on the base class objects using dynamic 
polymorphism, and does not require any knowledge 
about the internal structure of the primitives. 

The source code contains full information on how to 
implement and integrate new primitives, so we only sum- 
marise briefly what methods need to be provided. 

Weak designs need to be derived from class weakdes, and must implement

• compute_Si(uint64_t i, vector indices): compute the ith index set, and store the results in indices.

• compute_d(): compute the required amount of initial seed.

• get_r(): report the overlap r to the higher-level algorithms.

Optionally, the function set_params(uint64_t, uint64_t m) can, but need not, be implemented to initialise the parameters required for all weak designs.
Determining $d$ from $t$ seems straightforward, but is accompanied by constraints: the $\mathrm{GF}(2^x)$-based weak design, for instance, only works for values of $t$ that can be represented as a power of 2, so the design typically needs to choose larger values (resulting in more initial seed) than requested.

One-bit extractors need to be derived from class 
bitext, and must implement 

• num_random_bits() — compute the amount t of ini- 
tial seed bits required for every extracted bit. 

• compute_k() — determine the minimal source en- 
tropy required by the extractor for the parameter 
set under consideration. 

• extract (void *initial_rand) — extract one bit 
using the provided subset of the initial randomness. 

There are also generic functions to assign global ran- 
domness and other generic parameters to the 1-bit ex- 
tractor. They can, but need not be provided by an im- 
plementation. 

On the lower layers, the implementation was designed 
to use elementary machine arithmetic (as opposed to 
software-based multi-precision arithmetic) whenever pos- 
sible; this is an obvious precondition for an implementa- 
tion with good performance. In all performance critical 
operations, logarithms are not computed using floating 
point, but with integer operations since usually only floor 
or ceiling of the result is required. 
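To illustrate the point about integer logarithms: the floor and ceiling of a base-2 logarithm follow directly from the position of the highest set bit, so no floating-point evaluation is needed. The sketch below uses Python's `int.bit_length()`; the function names are our own, and the C++ implementation would typically rely on a count-leading-zeros instruction or compiler intrinsic instead.

```python
def floor_log2(x):
    """floor(log2(x)) for an integer x >= 1, using integer operations only."""
    return x.bit_length() - 1

def ceil_log2(x):
    """ceil(log2(x)) for an integer x >= 1, using integer operations only."""
    return (x - 1).bit_length()
```

Besides being fast, the integer version is exact for inputs such as $2^{53} + 1$, which a double-precision float cannot represent.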

The code uses a fixed-width integer data type with 64 bits to represent potentially large quantities like the number of input bits. It is important to note that the width $w$ of the index data type sets an upper bound on the amount of randomness that can be handled by the code, namely $2^{w-3}$ bytes (for $w = 32$ respectively $w = 64$), which corresponds to $2^w$ bits (the datum is used as an index into a bit field, and this field need not be representable by a machine quantity). Since contemporary 64-bit machines cannot handle more than $2^{48}$ bytes owing to virtual address space management limits [11], the choice does not introduce any additional limits. To process large amounts of randomness (multiple gigabytes), 64-bit machines and a 64-bit kernel running on the machine are required, which the code assumes to be the default setting.

B. Algorithms 

In the following, we give a concise description of 
all algorithms in a form that is helpful for actual 
implementations — in some contrast to the previously 
given descriptions that focus more on mathematical clar- 
ity, we provide recipes in a pseudo-formal language that 
is close enough to many contemporary imperative and 
object-oriented programming languages, yet still suffi- 
ciently abstract to avoid hiding the algorithmic core be- 
hind technical side-work. Although each algorithm can 



be captured with very few statements, we remark that 
a practical implementation needs to account for many 
non-trivial technical issues; our reference implementa- 
tion published as a part of this paper comprises about 
5000 lines of source code. 



1. Trevisan's extractor 

The Trevisan algorithm is independent of the type of 
weak design and bit extractor used; only the inferred 
parameters depend on the specific properties of the com- 
ponents: 

    procedure Trevisan(WD, Ext, n, m, μ, α, ε, ϱ^s, ϱ^d)
        t ← Ext.InputSize(n, m, μ, α, ε)
        d ← WD.InputSize(t)
        Reserve space for m bits in ϱ^o
        Reserve space for t numbers ∈ [d] in S
        for i ← 0, m − 1 do                ▷ Data parallel
            S ← WD.computeS(i)
            b ← 0
            for j ← 0, t − 1 do
                b_j ← ϱ^d[S_j]              ▷ Indices refer to bits
            end for
            ϱ^o_i ← Ext.extract(ϱ^s, b)
        end for
        return ϱ^o
    end procedure

The components WD and Ext may impose boundary 
conditions on the parameters; for instance, the single-bit 
seed length t must be a power of a prime number for the 
weak designs implemented in this paper. 
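The control flow of the Trevisan procedure can be illustrated with a small sketch in which the weak design and the one-bit extractor are passed in as plain functions. The two toy components below are hypothetical stand-ins chosen only to make the example runnable: `toy_design` uses disjoint seed blocks (which needs $d = m \cdot t$ seed bits and is therefore not a useful design), and `toy_extract` simply reads the sub-seed as an index into the source. Neither is part of the paper or of the reference implementation.

```python
def trevisan(source, seed, m, compute_s, extract):
    """Extract m bits from `source` using `seed` (both lists of 0/1)."""
    out = []
    for i in range(m):  # iterations are independent (data parallel)
        s_i = compute_s(i)                  # t indices into the seed
        sub_seed = [seed[j] for j in s_i]   # seed bits selected by S_i
        out.append(extract(source, sub_seed))
    return out

# Toy components, for illustration only (hypothetical):
def toy_design(i, t=2):
    """Disjoint blocks: S_i = {i*t, ..., i*t + t - 1} (no overlap at all)."""
    return [i * t + j for j in range(t)]

def toy_extract(source, sub_seed):
    """Read the sub-seed as a binary index and return that source bit."""
    index = int("".join(map(str, sub_seed)), 2)
    return source[index % len(source)]
```

Calling `trevisan([1, 0, 1, 1], [0, 0, 0, 1, 1, 0], 3, toy_design, toy_extract)` selects source positions 0, 1 and 2 in turn. The main loop never inspects how the index sets or the output bits are computed, which mirrors the separation between the algorithm and its primitives.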

2. Weak Designs 

a. Construction of Hartman and Raz The weak design of Hartman and Raz is based on evaluating polynomials over a finite field; recall from Section III A 2 that the dimension of the field needs to be a power of a prime number. We have implemented two variants: one based on the extension field $F = \mathrm{GF}(2^x)$, and one based on the prime field $F = \mathrm{GF}(p)$. The bit extractors can require arbitrary values of $t$ that are not necessarily compatible with the constraints of the weak design. In this case, $t$ needs to be increased to the next possible value $t'$ that can be provided by the weak design. Consequently, we need to distinguish between $t$, which represents the value that can be provided by the weak design, and $t_{\mathrm{req}}$, which is the value originally requested by the bit extractor. It necessarily holds that $t \ge t_{\mathrm{req}}$.

The basic algorithm for both finite fields is as follows (indices in square brackets denote bit selections):

    procedure HR.ComputeS(F, i, m, t)
        c ← ⌈log m / log t − 1⌉
        for j ← 0, c do                    ▷ Prepare polynomial coefficients
            α_j ← i[j · nb(t), (j + 1) · nb(t) − 1] mod t
        end for
        for a ← 0, t_req − 1 do
            b ← Σ_{j=0}^{c} α_j a^j          ▷ Evaluated over F, i.e. reduced mod |F|
            S_a[log(t), 2 · log(t) − 1] ← b
            S_a[0, log(t) − 1] ← a
        end for
        return S
    end procedure

For a field of prime dimension $p$, all calculations are performed modulo $p$. Notice, though, that it is not sufficient to simply reduce modulo $p$ after any multiplication (or addition/subtraction) has been performed, because this can easily lead to intermediate results that exceed the maximal bit width available in hardware. Multiplying two 40-bit numbers, for instance, can result in an 80-bit value, which exceeds the word size of 32-bit and 64-bit machines. A naive solution could fall back to using arbitrary-precision software arithmetic, which is unfortunately much slower than native machine hardware arithmetic. Consequently, we use algorithms that avoid intermediate overflows and can work with multiplicands of up to 61 bits, which is sufficient for our purposes. See the source code or Ref. [31] for details.
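One textbook way to multiply modulo $p$ without wide intermediates is the double-and-add scheme sketched below, in which no value ever exceeds $2p$. This is only an illustration of the overflow problem and one possible remedy; it is not claimed to be the algorithm used in the reference implementation, and the function name and the `word_bits` parameter are hypothetical. Python integers do not overflow, so the word-size limit is checked with explicit assertions.

```python
def mulmod(a, b, p, word_bits=64):
    """Return (a * b) % p using only values that fit into `word_bits` bits."""
    limit = 1 << word_bits
    assert 0 <= a < p and 0 <= b < p and 2 * p < limit
    result = 0
    while b:
        if b & 1:
            result += a          # result < 2p, still inside the word
            if result >= p:
                result -= p
        a += a                   # a < 2p, still inside the word
        if a >= p:
            a -= p
        assert result < limit and a < limit
        b >>= 1
    return result
```

With the 61-bit Mersenne prime $p = 2^{61} - 1$, for instance, `mulmod(p - 1, p - 2, p)` evaluates to 2, although the plain product of the two factors would need about 122 bits.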

For the extension field $\mathrm{GF}(2^m)$, it is not sufficient to perform a simple division of arithmetic results by a scalar to satisfy the constraints of the finite field. Instead, all elements of the field are formally interpreted as polynomials over the binary field, and arithmetic operations are performed modulo an irreducible polynomial that needs to be constructed dependent on the field order. For all field orders of practical interest, an irreducible polynomial with only 3 or 5 non-zero terms (a trinomial or pentanomial) is available (see, e.g., Ref. [27]), so calculations can be optimised for these cases.

b. Block Weak Design The block weak design is 
based on a basic design whose matrix representation is 
re-used multiple times as part of the total weak design — 
once the matrix representation of the basic design is 
known, it is possible to construct the complete design 
by placing sub-matrices of the basic design matrix on 
the diagonal of a larger matrix. One possible implemen- 
tation could thus use sparse matrix techniques to store 
the basic design in memory, and derive all other blocks 
from this representation. 

When the basic design is not represented by a matrix, but as vectors of indices, it is possible to compute the content of a row of block $j$ from the corresponding basic design row by adding $j \cdot t^2$ to all values of the set $S$ corresponding to the matrix row. Since it is possible to re-arrange the rows of $W$ without changing the properties of the weak design, we use a suitable permutation (derived from the block parameters; see the source code for details) of the rows of $W$ such that all rows that originate from the same row of the basic design are adjacent to each other, which allows us to cache calls to the basic construction. Since the design is traversed from row to row in the Trevisan algorithm, the permuted row order minimises calls to the basic construction.
    procedure BWD.ComputeS(WD, i, i_c, S^c, t)
        Infer j, k from i
        if k ≠ i_c then
            i_c ← k
            S ← WD.computeS(i_c)
            for ζ ← 0, t − 1 do            ▷ Fill cache
                S^c_ζ ← S_ζ
            end for
        else
            for ζ ← 0, t − 1 do
                S_ζ ← S^c_ζ + j · t²
            end for
        end if
        return S
    end procedure



3. 1-Bit extractors 

Finally, we discuss the algorithms used for the 1-bit 
extractors implemented as part of this paper. 

a. XOR Code An implementation of the XOR code requires deriving the parameter $\ell$ from the experimental parameters; since this can be achieved by a standard numerical optimisation, we will not discuss a formal algorithm here, but refer the reader to the source code for the details. The algorithm itself is compact:

    procedure XOR.extract(ϱ^s, ϱ^d)
        r ← 0
        for i ← 0, ℓ − 1 do
            ζ ← ϱ^d[i · nb(n − 1), (i + 1) · nb(n − 1) − 1]
            r ← r ⊕ ϱ^s[ζ]
        end for
        return r
    end procedure
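A direct transcription of the XOR procedure might look as follows. The sketch assumes that the sub-seed is a list of bits holding $\ell$ consecutive position fields of $\mathrm{nb}(n - 1)$ bits each, most significant bit first. Wrapping positions modulo $n$ when $n$ is not a power of two is our own assumption for the sake of a runnable example; the function name is hypothetical.

```python
def xor_extract(source, sub_seed, l):
    """XOR of l source bits at positions encoded in the sub-seed."""
    n = len(source)
    width = max(1, (n - 1).bit_length())  # bits needed for an index < n
    assert len(sub_seed) >= l * width
    r = 0
    for i in range(l):
        chunk = sub_seed[i * width:(i + 1) * width]
        position = int("".join(map(str, chunk)), 2)
        r ^= source[position % n]  # assumption: wrap if n is not a power of two
    return r
```

For an 8-bit source, each position takes 3 seed bits, so $\ell = 2$ consumes 6 bits of sub-seed.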

b. Polynomial Hashing The algorithm to perform 
polynomial hashing based on a concatenation of a Reed- 
Solomon and a Hadamard code is as follows: 
procedure RSH.extract(ρ^i, ρ^g, n, ε)
    ζ ← 0, l ← ⌈log n + 2 log 2/ε⌉
    s ← ⌈n/l⌉
    Pick irreducible polynomial for GF(2^l)
    for i ← 0, s - 1 do                ▷ Determine coefficients
        c_i ← ρ^g[i + …]
    end for
    α ← ρ^i[0 : l - 1]                 ▷ Reed-Solomon step
    r ← Σ_{i=1}^{s} c_i α^{s-i}        ▷ Computed over GF(2^l)
    b ← 0                              ▷ Hadamard step
    for j ← 0, l - 1 do
        b ← b ⊕ (r[j] · ρ^i[l + j])
    end for
    return b
end procedure

Since the length of the global randomness considerably exceeds
the bit length of the largest quantity representable with
elementary machine data types in all but the most pathological
cases, evaluation of the polynomial has to be performed using
arbitrary precision software arithmetic.

There are two obvious optimisations: The global randomness does
not change across invocations of the RSH extractor, so it is
possible to compute the coefficients of the polynomial once, and
re-use the results in subsequent evaluations. In a practical
implementation, it is also more efficient to use Horner's rule
for evaluating the polynomial [35] instead of performing the
straightforward evaluation shown in the algorithm.
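Horner's rule over a binary field can be sketched as follows. This is an illustrative Python version, not the framework's implementation: field elements are represented as Python integers (which have arbitrary precision), `modulus` is the bit pattern of an irreducible polynomial of the given `degree`, and both function names are assumptions for this example.

```python
from typing import Iterable


def gf2_mul(a: int, b: int, modulus: int, degree: int) -> int:
    """Multiply a and b in GF(2^degree); modulus encodes the irreducible polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^degree) is XOR
        b >>= 1
        a <<= 1
        if (a >> degree) & 1:    # reduce as soon as the degree overflows
            a ^= modulus
    return result


def horner_eval(coefficients: Iterable[int], alpha: int, modulus: int, degree: int) -> int:
    """Evaluate the polynomial at alpha, leading coefficient first."""
    result = 0
    for c in coefficients:
        result = gf2_mul(result, alpha, modulus, degree) ^ c
    return result
```

For example, in GF(2^3) with the irreducible polynomial x^3 + x + 1 (bit pattern `0b1011`), evaluating x^3 at the element x with `horner_eval([1, 0, 0, 0], 2, 0b1011, 3)` yields x + 1. Horner's rule needs one multiplication per coefficient and no explicit powers of alpha.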

The final parity calculation is not done using single- 
bit operations in the actual implementation, but is split 
into two steps: Firstly, the logical "and" operation is 
computed block-wise on machine-word sized blocks. Sec- 
ondly, the parity operation is built on special-purpose 
machine operations (or compiler intrinsics) to count the 
number of bits set in the result of the "and" operation. 
The parity can then be derived by checking if the bit 
count is even or odd. 
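The two-step parity computation can be illustrated with a short Python sketch. Here each list entry plays the role of one machine word, and `bin(...).count("1")` stands in for the special-purpose bit-count instruction or compiler intrinsic; the function name is hypothetical.

```python
from typing import Sequence


def masked_parity(words_a: Sequence[int], words_b: Sequence[int]) -> int:
    """Parity of the bitwise AND of two equally long sequences of machine words."""
    bit_count = 0
    for a, b in zip(words_a, words_b):
        bit_count += bin(a & b).count("1")   # stands in for a popcount intrinsic
    return bit_count & 1                     # odd count gives parity 1
```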

c. Lu's construction The algorithm for Lu's extrac- 
tor based on a random walk on an expander graph is as 
follows (we do not discuss how the optimisations required 
to determine the parameters $c$ and $l$ are performed; see
the source code for details): 
procedure LU.extract(ρ^i, ρ^g, c, l)
    v ← ρ^i[0 : ζ - 1]                         ▷ Initial node
    r ← 0, b ← 3                               ▷ 3 bits to represent an edge
    w ← ρ^i[ζ : ζ + c(l - 1) · b - 1]
    s ← ρ^i[ζ + c(l - 1) · b : ]
    for i ← 0, c - 1 do
        r ← r ⊕ (ρ^g[v] · s[i])
        for j ← 0, l - 2 do                    ▷ Random walk
            e ← w[(i(l - 1) + j) · b, (i(l - 1) + j + 1) · b - 1]
            v ← next.vertex(v, e)
        end for
    end for
    r ← r ⊕ (ρ^g[v] · s[c])
    return r
end procedure

ζ denotes the number of bits required to store the index of a
node. Function next.vertex(v, e) computes the value of the next
vertex given the current vertex and the next edge; it is a
straightforward translation of the calculation rule given
earlier in Section III C 2.



Most of the implementation complexity for the Lu expander stems
from the need to select subsets of bit strings. To simplify
distributing the initial randomness provided by the weak design
into three components as shown above, the actual implementation
assumes that the contributions start on indices that are evenly
divisible by the bit width of the data type used to represent
edges. This simplifies the implementation, but implies that
slightly more randomness is required than is theoretically
necessary, although the increase is only a negligible additive
amount.
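The effect of the alignment requirement can be illustrated with a small Python sketch. The component lengths and the word width used below are hypothetical values chosen for the example, and the helper name is not taken from the source code.

```python
from typing import List, Sequence, Tuple


def aligned_offsets(lengths: Sequence[int], width: int) -> Tuple[List[int], int]:
    """Start index of each component when every start is a multiple of width."""
    offsets = []
    position = 0
    for length in lengths:
        position = -(-position // width) * width   # round up to a multiple of width
        offsets.append(position)
        position += length
    return offsets, position
```

With component lengths 5, 12 and 4 and a width of 8, the components start at indices 0, 8 and 24 and occupy 28 bits in total instead of 21, so the padding costs at most `width - 1` bits per component.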



V. RUNTIME COMPARISON 

Owing to the many aspects involved in determining code
performance (throughput, scalability, weak design versus
extractor performance, parameter ranges, and machine
characteristics, among others), and because of the large number
of combinations of primitives, it is neither possible nor
reasonable to present measurements for all cases (since the full
sources are available, measurements for a particular case of
interest can easily be conducted by interested parties).
Instead, we focus on a selection of measurements that describe
cases of typical experimental interest. We use two machines to
run the tests; detailed technical specifications are shown in
Table I. One machine is a standard laptop (MacBook Air) that
allows for testing the performance on an average personal
computer, and serves as an apt basis for comparison with the
machine used for the measurements in Ref. [7]. The second
machine is a sizeable workstation that gives an indication of
the behaviour in high-performance computing scenarios, or when
one is willing to spend substantial computational effort on the
post-processing, for example in scenarios in which the highest
possible security is the foremost priority.



Oh 



^3 IS 



a? S3 a 



Machine 
Laptop 



CPU 

Intel Core i5 
1.6 Ghz 



< 



=fc U H M Kernel 

1 2 2 4 4 Darwin 11.4.0 



Workstation AMD Opteron ^] 6 
1.9 Ghz 



32 Linux 3.0 



a Pairs of two CPUs share one socket 

TABLE I. Machines deployed in the benchmark measure- 
ments. 



The measurement results are shown in Figures 11, 12, 13, 14
and 15; refer to the captions for a detailed discussion of the
results.
[Plot legend: weak design block(gf2m), block(gfp). Horizontal axis: number of input bits (log. scale).]

FIG. 11. Scaling behaviour of RSH with a block design for
varying input lengths. For a small number of CPUs, performance
degrades considerably with increasing input length, as expected
for a non-local extractor. Good throughput (more than
100 kbit/s) is only obtained for very small input sizes
($2^{12}$ bits is only 4 KiBit of data!), for which the required
amount of initial seed drastically exceeds the extracted amount
of randomness.

With many cores, the achieved speed-up initially does not
compensate for the overhead of setting up and performing
parallel operations, so the throughput increases to a local
maximum, and then decreases as expected with larger input
lengths. Consequently, it is not sufficient to simply add more
CPUs in a given scenario to increase throughput; practical
book-keeping tasks and technical aspects can easily dominate
the actual problem. In particular, this implies that purely
technical improvements like porting the processing to massively
parallel approaches like GPU computing will not automatically
resolve all performance needs; a proper choice of primitives for
given requirements is essential, which is only possible with a
framework that allows for flexibly combining these primitives.



[Plot title: Throughput scaling (block(GF(2^x)), RSH) (40 iterations)]

FIG. 12. Throughput comparison of our results (obtained on a
laptop, represented by boxplots) with the results obtained by
Ma et al. (represented by triangles) for the combination of
primitives supported by their implementation. Since the code of
[7] seems to be limited to running on one CPU core, we have also
included an artificially constrained measurement for the code
discussed in this paper. Generally, our framework is 2-3 orders
of magnitude faster in terms of throughput, and allows for
dealing with inputs that surpass Ref. [7] by many orders of
magnitude.

applicable (mostly because short-seed extractors all suf- 
fer from a low extraction rate), the implementation can, 
for instance, satisfy the needs of all current quantum key 
distribution schemes. The authors hope that the public 
availability of the source code, together with the exten- 
sible architecture, will spawn contributions from other 
researchers to turn future theoretical progress into prac- 
tical results. 



SUMMARY 

We have presented a modular, scalable implementation of
Trevisan's construction for randomness extraction, together
with detailed parameter derivations and improved mathematical
proofs. We have shown that the feasibility or non-feasibility
of Trevisan's scheme is not mainly a question of computational
complexity, but depends on the particular choice of primitives
used as components of the algorithm; different scenarios
require different constituents. Although our measurements
indicate that there exist use cases that require theoretical
improvements to make Trevisan's construction
[Plot titles: "XOR+GF(p) throughput scaling (48 cores, 40 repetitions)"; "XOR: n=2 x 1 6 (m=n/100); RSH: n=2^16, m=2^15 (48 cores)". Horizontal axes: "Input bit length n [GiBit], m=n/100"; "Number of CPU cores".]



FIG. 13. Scaling behaviour of XOR/GF(p) for increasing input
size n. Although there is a marked decrease in performance for
input lengths of more than 200 MiBit, a throughput of at least
160 kbit/s is sustained even for multi-GiBit input lengths
(since the XOR extractor is local, there is no convergence
towards a zero throughput rate for long inputs), which matches
the requirements of typical quantum key distribution mechanisms
currently under discussion.
Since only 1% of the input is extracted, the code needs to deal
with input data rates of 16-20 MiBit/s, making the primitives
suitable for extracting randomness from fast random number
sources; one example is Ref. [36].



Appendix A: Extractor definitions 

An extractor $\mathrm{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$
is a function which takes a weak source of randomness $X$ and
a uniformly random, short seed $Y$, and produces some output
$\mathrm{Ext}(X, Y)$, which is almost uniform. The extractor is
said to be strong if the output is approximately independent of
the seed.

The distance from uniform is measured by the trace distance,
defined as $d(\rho, \sigma) := \frac{1}{2} \|\rho - \sigma\|_{\mathrm{tr}}$,
where $\|\cdot\|_{\mathrm{tr}}$ denotes the trace norm given by
$\|A\|_{\mathrm{tr}} := \mathrm{tr}\sqrt{A^\dagger A}$.

Definition A.1 (strong extractor [37]). A function
$\mathrm{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a
$(k, \varepsilon)$-strong extractor, if for all distributions
$X$ with min-entropy $H_{\min}(X) \geq k$



FIG. 14. Comparison of per-core performance for XOR/GF(p) and
RSH/Block(GF(p)) primitive combinations. The per-core
throughput for the local XOR extractor drops to about 75% of
the single-core performance for a very large number of cores
(48), which makes it an excellent choice for massively parallel
systems. For the RSH extractor, performance in the many-core
case is only half of the performance of a single CPU, which can
be attributed to the larger amount of data over which the
primitive combination needs to iterate, and the consequently
increased load on the system buses.



and a uniform seed $Y$, we have$^{18}$

$$\frac{1}{2} \left\| \rho_{\mathrm{Ext}(X,Y)Y} - \tau_U \otimes \rho_Y \right\|_{\mathrm{tr}} \leq \varepsilon,$$

where $\tau_U$ is the fully mixed state on a system of
dimension $2^m$.
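For classical distributions the trace distance reduces to the variational distance, so the error of a small extractor can be computed by brute force. The following Python sketch is a toy illustration under that assumption; the function name and the representation of bit strings as tuples are choices made for this example only.

```python
from fractions import Fraction
from itertools import product


def strong_extractor_error(source, ext, d, m):
    """Variational distance between (Ext(X,Y), Y) and uniform x uniform seed.

    source: dict mapping input tuples to probabilities (Fractions),
    ext:    function (x, y) -> tuple of m output bits,
    d, m:   seed length and output length in bits.
    """
    seeds = list(product((0, 1), repeat=d))
    p_seed = Fraction(1, len(seeds))
    joint = {}
    for x, p_x in source.items():
        for y in seeds:
            key = (ext(x, y), y)
            joint[key] = joint.get(key, Fraction(0)) + p_x * p_seed
    ideal = p_seed * Fraction(1, 2 ** m)
    total = Fraction(0)
    for z in product((0, 1), repeat=m):
        for y in seeds:
            total += abs(joint.get((z, y), Fraction(0)) - ideal)
    return total / 2
```

As a usage example, take $X$ uniform on two bits and the one-bit function $\mathrm{Ext}(x, y) = \langle x, y \rangle \bmod 2$ with a two-bit seed. The all-zero seed always yields output 0, and the computed distance is 1/8.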

When (quantum) side information E about the source 
X is present, the randomness of the source is measured 
relative to this side information. We also require the 
output of the extractor to be close to uniform and inde- 
pendent from E. 

Definition A.2 (quantum-proof strong extractor [551 Section 2.6]).
A function $\mathrm{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$
is a quantum-proof (or simply quantum) $(k, \varepsilon)$-strong
extractor, if for all states $\rho_{XE}$ classical on $X$ with



18 A more standard classical notation would be
$\frac{1}{2} \|\mathrm{Ext}(X, Y) \circ Y - U \circ Y\| \leq \varepsilon$,
where the distance metric is the variational distance. However,
since classical random variables can be represented by quantum
states diagonal in the computational basis, and the trace
distance reduces to the variational distance, we use the quantum
notation for compatibility with the rest of this work.
[Plot title: Scaling behaviour: Block(GF(p))+RSH on 48 core Opteron (n=2 , 40 repetitions)]

FIG. 15. Throughput scaling of the block(GF(p))/RSH primitive
combination for an increasing number of CPUs. The speedup is
well below one even for a moderate number of involved cores,
and sees a further slow-down in the many-core case. Nonetheless,
data rates that are reasonable for practical applications are
obtained, making the primitive combination a viable choice to
post-process the output of slow devices for which it is
important not to sacrifice valuable entropy, as the faster XOR
extractor does. While the per-core measurement in Figure 14 is
more interesting from a scalability point of view, this figure
provides guidance on what resources are necessary to satisfy
given experimental constraints.



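$H_{\min}(X|E)_\rho \geq k$, and for a uniform seed $Y$, we have

$$\frac{1}{2} \left\| \rho_{\mathrm{Ext}(X,Y)YE} - \tau_U \otimes \rho_Y \otimes \rho_E \right\|_{\mathrm{tr}} \leq \varepsilon,$$

where $\tau_U$ is the fully mixed state on a system of
dimension $2^m$.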

The function Ext is a classical-proof $(k, \varepsilon)$-strong
extractor with uniform seed if the same holds with the system
$E$ restricted to classical states.

Note that any conventional extractor (Definition A.1) is
classical-proof with slightly weaker parameters.



Lemma A.3 ([351 Section 2.5], [351 Proposition 1]). Any
$(k, \varepsilon)$-strong extractor is a classical-proof
$(k + \log 1/\varepsilon, 2\varepsilon)$-strong extractor.



In the extractor constructions described in Section III, we
are particularly interested in extractors which only need to
process a few bits of the input for every bit of output. These
extractors are called local, and defined as follows.



(or $\ell$-local), if for every $y \in \{0,1\}^d$, the function
$x \mapsto \mathrm{Ext}(x, y)$ depends on only $\ell$ bits of its
input, where the bit locations are determined by $y$.

This notion of local extractors applies equally to ex- 
tractors with and without (quantum) side information. 



Appendix B: Known extractor results 

The next sections contain many known theorems on extractors,
which we need to derive the parameters of the constructions
from Section III.



1. List-decodable codes 

A standard error correcting code guarantees that if 
the error is small, any string can be uniquely decoded. 
A list-decodable code guarantees that for a larger (but 
bounded) error, any string can be decoded to a list of 
possible messages. 

Definition B.1 (list-decodable code [10]). A code
$C : \{0,1\}^n \to \{0,1\}^{\bar{n}}$ is said to be
$(\varepsilon, L)$-list-decodable if every Hamming ball of
relative radius $1/2 - \varepsilon$ in $\{0,1\}^{\bar{n}}$
contains at most $L$ codewords.

List-decodable error correcting codes are known to be 
1-bit extractors [201 [2TJ- This has been rewritten out 
explicitly in [J]. 

Lemma B.2 (g| Theorem D.^). Let
$C : \{0,1\}^n \to \{0,1\}^{\bar{n}}$ be an
$(\varepsilon, L)$-list-decodable code. Then the function

$$C' : \{0,1\}^n \times [\bar{n}] \to \{0,1\}, \qquad (x, y) \mapsto C(x)_y,$$

is a $(\log L + \log \frac{1}{2\varepsilon}, 2\varepsilon)$-strong
extractor.
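The map $(x, y) \mapsto C(x)_y$ simply reads one position of a codeword. As an illustration, the following Python sketch uses the Hadamard code, where position $y$ of the codeword of $x$ is the inner product of $x$ and $y$ modulo 2. Messages and positions are encoded as integers, and the function names are assumptions for this example.

```python
from typing import List


def hadamard_bit(x: int, y: int) -> int:
    """Bit y of the Hadamard codeword of x: the inner product <x, y> mod 2."""
    return bin(x & y).count("1") & 1


def hadamard_codeword(x: int, n: int) -> List[int]:
    """Full codeword C(x) of length 2^n for an n-bit message x."""
    return [hadamard_bit(x, y) for y in range(2 ** n)]
```

The one-bit extractor never needs the full codeword: given the seed `y`, `hadamard_bit(x, y)` computes the required position directly.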

As noted in a footnote of [4], this lemma can be 
strengthened to classical-proof extractors. 

Lemma B.3. Let $C : \{0,1\}^n \to \{0,1\}^{\bar{n}}$ be an
$(\varepsilon, L)$-list-decodable code. Then the function

$$C' : \{0,1\}^n \times [\bar{n}] \to \{0,1\}, \qquad (x, y) \mapsto C(x)_y,$$

is a classical-proof
$(\log L + \log \frac{1}{2\varepsilon}, 2\varepsilon)$-strong
extractor.



19 In the arXiv version, this theorem is numbered C.3 



Definition A.4 ($\ell$-local extractor [21]). An extractor
$\mathrm{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is
$\ell$-locally computable
2. One-bit extractors

König and Terhal [39] show that any one-bit extractor is
quantum-proof.

Theorem B.4 ([39, Theorem III.1]). Let
$C : \{0,1\}^n \times \{0,1\}^t \to \{0,1\}$ be a
$(k, \varepsilon)$-strong extractor. Then $C$ is a quantum-proof
$(k + \log 1/\varepsilon, 3\sqrt{\varepsilon})$-strong extractor.

If, however, we have a construction which has already been
shown to be a classical-proof $(k, \varepsilon)$-strong
extractor, then Theorem B.4 can be refined as follows.



Lemma B.5 (Implicit in [35]). Let
$C : \{0,1\}^n \times \{0,1\}^t \to \{0,1\}$ be a classical-proof
$(k, \varepsilon)$-strong extractor. Then $C$ is a quantum-proof
$(k, (1 + \sqrt{2})\sqrt{\varepsilon})$-strong extractor.



3. Universal hashing 

A family of hash functions is almost universal, if the 
probability of a collision is low. 

Definition B.6 ([25]). A family of hash functions
$\{h : X \to Z\}$ is said to be $\delta$-almost universal$_2$
($\delta$-AU$_2$), if for any $x, x' \in X$ with $x \neq x'$,

$$\Pr_h[h(x) = h(x')] \leq \delta,$$

where the hash functions are chosen uniformly at random.

The family is said to be universal$_2$, if it is
$\delta$-AU$_2$ with $\delta = 1/|Z|$.
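For small families, the collision bound in this definition can be checked exhaustively. The Python sketch below is a toy illustration: the family of inner-product functions $h_a(x) = \langle a, x \rangle \bmod 2$ and both function names are choices made for this example and are not part of the framework.

```python
from fractions import Fraction
from itertools import combinations, product


def max_collision_probability(family, domain):
    """Largest Pr_h[h(x) == h(x')] over distinct x, x' for a uniformly chosen h."""
    worst = Fraction(0)
    for x, x_prime in combinations(domain, 2):
        collisions = sum(1 for h in family if h(x) == h(x_prime))
        worst = max(worst, Fraction(collisions, len(family)))
    return worst


def inner_product_family(n):
    """Toy family h_a(x) = <a, x> mod 2 for all a in {0,1}^n."""
    def make(a):
        return lambda x: sum(ai * xi for ai, xi in zip(a, x)) % 2
    return [make(a) for a in product((0, 1), repeat=n)]
```

For $n = 2$ the largest collision probability is 1/2, which equals one over the size of the one-bit range.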

Tomamichel et al. [6] show that for such a family of hash
functions $\{h_y\}_y$, the corresponding extractor, defined as
$\mathrm{Ext}(x, y) := h_y(x)$, is quantum-proof if $\delta$ is
small enough.

Theorem B.7 ([6, Theorem 7]). If a family of hash func- 
tions {h : {0,1}" -> {0,1}™} is 8-AU2 for 8 = i±g£, 
then chosen uniformly at random, they build a quantum- 
proof (m + 4 log - + 1, 2e)-strong extractor. 



4. Trevisan's extractor 

In [4, Theorem 4.6], De et al. show that if a
$(k, \varepsilon)$-strong one-bit extractor is used in Trevisan's
construction, the final extractor is a quantum-proof
$(k + rm + \log 1/\varepsilon, 3m\sqrt{\varepsilon})$-strong
extractor, where $m$ is the output length and $r$ is a parameter
of the weak design.

That theorem is the combination of the following implicit
lemma and Lemma A.3.



Lemma B.8 (Implicit in [4]). Let
$C : \{0,1\}^n \times \{0,1\}^t \to \{0,1\}$ be a quantum-proof
$(k, \varepsilon)$-strong extractor with uniform seed and
$S_1, \ldots, S_m \subset [d]$ a weak $(m, t, r, d)$-design. Then
Trevisan's extractor,
$\mathrm{Ext}_C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$, is a
quantum-proof $(k + rm, m\varepsilon)$-strong extractor.



If we use a one-bit extractor which is known to be
quantum-proof, we get better parameters from Lemma B.8 than
from [4, Theorem 4.6].
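The difference can be made concrete with a short numerical sketch. The Python functions below just evaluate the parameter formulas stated above; they assume that all logarithms are taken to base 2, and the function names as well as the example values are illustrative.

```python
import math


def theorem_4_6(k, eps, m, r):
    """Parameters from De et al., Theorem 4.6, for a (k, eps)-strong one-bit extractor."""
    return k + r * m + math.log2(1 / eps), 3 * m * math.sqrt(eps)


def via_lemma_b5_and_b8(k, eps, m, r):
    """Lemma B.5 followed by Lemma B.8 for a classical-proof (k, eps)-strong one-bit extractor."""
    eps_quantum = (1 + math.sqrt(2)) * math.sqrt(eps)   # Lemma B.5
    return k + r * m, m * eps_quantum                   # Lemma B.8
```

For example, with $k = 100$, $\varepsilon = 2^{-20}$, $m = 10$ and $r = 1$, the first route requires min-entropy 130 and gives error $30 \cdot 2^{-10}$, while the second requires only 110 and gives the smaller error $10 (1 + \sqrt{2}) \cdot 2^{-10}$.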



Appendix C: Weak design proofs 

1. Basic construction 

Lemma C.1. The weak design construction described in
Section III B 1 has $r < 2e$.

Proof. Ma and Tan [18] prove that if $m \in [t^c, t^{c+1}]$ and
$t^c$ divides $m$, then the weak design has $r < e$. The lemma
is thus immediate for $m = k t^c$ and any integer $1 \leq k < t$.

Let $k t^c < m < (k + 1) t^c$ for some integer $1 \leq k < t$.
Since the construction for $m$ is the same as the construction
for $m' = (k + 1) t^c$ with the last sets $S_p$ dropped, the
overlap can only decrease. Thus

$$\sum_{q<p} 2^{|S_q \cap S_p|} < e m' = \frac{k+1}{k}\, k t^c e < \frac{k+1}{k}\, e m \leq 2 e m.$$

□
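The overlap quantity bounded in this proof is easy to evaluate for an explicit family of sets. The following Python sketch computes the smallest $r$ for which $\sum_{q<p} 2^{|S_q \cap S_p|} \leq r m$ holds for every $p$; the function name and the example sets are illustrative only.

```python
from fractions import Fraction
from typing import List, Set


def overlap_factor(sets: List[Set[int]]) -> Fraction:
    """Smallest r with sum_{q<p} 2^|S_q & S_p| <= r * m for every p."""
    m = len(sets)
    worst = 0
    for p in range(m):
        total = sum(2 ** len(sets[q] & sets[p]) for q in range(p))
        worst = max(worst, total)
    return Fraction(worst, m)
```

For the three sets {0, 1}, {1, 2} and {3, 4}, the largest sum is 2, so the function returns 2/3.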



2. Reducing the overlap 

Lemma C.2. The weak design construction described in
Section III B 2 has $r = 1$.

Proof. For simplicity, we number the sets of the weak design
$W$ with two indices, $S_{i,j}$, where $0 \leq i \leq \ell$ and
$1 \leq j \leq m_i$, and label the corresponding set of the basic
weak design $S_j$. We need to show that the second condition of
Definition III.1 holds for $r = 1$, namely that for all $(i, j)$,

$$\sum_{(g,h)<(i,j)} 2^{|S_{g,h} \cap S_{i,j}|} \leq m,$$

where $\{(g,h) : (g,h) < (i,j)\} := \bigcup_{g<i} \{(g,h) : h \leq m_g\} \cup \{(i,h) : h < j\}$.
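Note that Q implies that for all $0 \leq k \leq \ell - 1$,

$$\sum_{j \leq k} n_j \leq \sum_{j \leq k} m_j < \sum_{j \leq k} n_j + 1, \qquad \text{(C1)}$$

from which we get

$$m_k < \sum_{j \leq k} n_j - \sum_{j \leq k-1} m_j + 1 \leq n_k + 1. \qquad \text{(C2)}$$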



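Furthermore, from the sum of a geometric series, we have

$$\sum_{j \leq k-1} n_j + r' n_k = \frac{1 - (1 - 1/r')^k}{1 - (1 - 1/r')}\, n_0 + r' \left(1 - \frac{1}{r'}\right)^k n_0 = r' n_0 = m - r'. \qquad \text{(C3)}$$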
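The geometric-series identity (C3) can be checked with exact rational arithmetic. The sketch below assumes that the block sizes are $n_j = (1 - 1/r')^j (m/r' - 1)$, which is the form consistent with $r' n_0 = m - r'$; this definition comes from the main text and is restated here as an assumption, and the function name is illustrative.

```python
from fractions import Fraction


def c3_left_hand_side(m: int, r_prime: int, k: int) -> Fraction:
    """sum_{j <= k-1} n_j + r' * n_k for n_j = (1 - 1/r')^j * (m/r' - 1)."""
    ratio = 1 - Fraction(1, r_prime)
    n = [ratio ** j * (Fraction(m, r_prime) - 1) for j in range(k + 1)]
    return sum(n[:k]) + r_prime * n[k]
```

For $m = 100$ and $r' = 3$ the expression equals $m - r' = 97$ for every $k$, as (C3) states.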
For any two sets $S_{i,j}$ and $S_{g,h}$ with $i \neq g$, we have
$|S_{g,h} \cap S_{i,j}| = 0$. Thus for any set $S_{i,j}$ with
$i \leq \ell - 1$, we have

$$\begin{aligned}
\sum_{(g,h)<(i,j)} 2^{|S_{g,h} \cap S_{i,j}|} &= \sum_{g<i,\, h \leq m_g} 1 + \sum_{h<j} 2^{|S_h \cap S_j|} \\
&< \sum_{g<i} n_g + 1 + r'(n_i + 1) \\
&= m + 1,
\end{aligned}$$

where we used (C1) and (C2) in the second from the last line,
and (C3) in the last line. Since the LHS of the above inequality
is an integer, and the inequality is strict, we must have

$$\sum_{(g,h)<(i,j)} 2^{|S_{g,h} \cap S_{i,j}|} \leq m.$$

Finally, for the case of $S_{\ell,j}$, note that $\ell$ was
chosen such that $m_\ell \leq t$. This can be seen as follows.

$$\begin{aligned}
m_\ell = m - \sum_{j \leq \ell-1} m_j &\leq m - \sum_{j \leq \ell-1} n_j \\
&= m - \frac{1 - (1 - 1/r')^\ell}{1/r'} \left(\frac{m}{r'} - 1\right) \\
&= r' + \left(1 - \frac{1}{r'}\right)^\ell (m - r').
\end{aligned}$$

By plugging $\ell$ in this, we get $m_\ell \leq t$. Since $t$ is
the size of the finite field, the polynomial used to generate
the elements of $S_{\ell,j}$ has all coefficients 0, except the
constant term which is $j$. We thus have
$S_j = \{(x, j)\}_{x \in GF(t)}$ and so the sets
$\{S_{\ell,j}\}_{j \in GF(t)}$ have no intersection. Hence

$$\sum_{(g,h)<(\ell,j)} 2^{|S_{g,h} \cap S_{\ell,j}|} \leq \sum_{g \leq \ell} m_g = m.$$